On this blog, I edit drafts in a local text editor rather than in the Blogger CMS, and I update existing posts more often than I publish new ones. To simplify updating, I created a Bash script that automatically extracts specific elements from local HTML files and sends them in PATCH requests through the Blogger API v3.
This post assumes the following local HTML files:
- Each div.post element in a file:
  - It has an empty line before and after it to distinguish its changes.
  - The value of its id attribute is the known Blogger resource (page or post) ID.
  - Its start tag is a single line.
  - It has paragraphs in its post body.
- Git tracks changes in these files.
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Document Title</title>
  </head>
  <body>

    <div class="post" id="RESOURCE_ID">
      <h3>Post Title</h3>
      <div class="post-body">
        <p>The first paragraph.</p>
        <p>The second paragraph.</p>
      </div>
    </div>

    <div class="post" id="RESOURCE_ID">
      <h3>Post Title</h3>
      <div class="post-body">
        <p>The first paragraph.</p>
        <p>The second paragraph.</p>
      </div>
    </div>

  </body>
</html>
Detect Local Changes using Git and Specify Elements
First, detect changed local HTML files individually using git diff with the --name-only option. For each file, extract the line numbers of differences from the hunk headers, which start with @@ characters.
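As a rough illustration of the hunk-header step, the sketch below reads a unified diff from standard input and prints the starting line number of each hunk on the new-file side. The function name hunk_start_lines and the sample diff are made up for this example; the sed expression is the same one used in the script in this section.

```shell
#!/usr/bin/env bash
# hunk_start_lines: read a unified diff on stdin and print, one per line,
# the starting line number of each hunk on the "+" (new file) side.
# The function name is hypothetical; it only wraps a single sed call.
hunk_start_lines() {
  sed -nE 's/^@@+ -[0-9]+(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@+.*$/\2/p'
}

# Sample input resembling the output of `git diff -U0 FILE`.
sample_diff='diff --git a/posts.html b/posts.html
--- a/posts.html
+++ b/posts.html
@@ -21 +21 @@ <div class="post-body">
-<p>The first paragraph.</p>
+<p>The first paragraph, revised.</p>
@@ -30,0 +31,2 @@ <div class="post-body">
+<p>A new paragraph.</p>
+<p>Another new paragraph.</p>'

printf '%s\n' "$sample_diff" | hunk_start_lines
# Prints 21 and 31, one per line.
```

With -U0, Git emits no context lines, so each hunk starts exactly at a changed line, which is why the start number alone is enough here.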
For each line number, search the range from line 1 up to that line for start tags of div.post elements that have an id attribute, and read the resource ID from each. Then, assign the one nearest to the line to a resource_id variable, because that element contains the difference.
If the resource_id variable is empty, the enclosing div.post element has no id attribute. If its value equals the previous one, the difference lies within the same element as the previous difference. Ignore both cases.
id_regex='^.*<div(.*\s+id="([0-9]+)")?.*\s+class="post"(.*\s+id="([0-9]+)")?.*$'
id_replacement='\2\4'
for file in $(git diff --name-only); do
  # Take the new-file start line of each hunk from its @@ header.
  for line_number in $(git diff -U0 "$file" |
    sed -nE 's/^@@+ -[0-9]+(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@+.*$/\2/p'); do
    # Keep the ID of the last div.post start tag at or before that line.
    resource_id=$(sed -nE "1,${line_number}s/$id_regex/$id_replacement/p" \
      "$file" |
      tail -n 1) || exit
    if [ -n "$resource_id" ] &&
      [ "$resource_id" != "$previous_resource_id" ]; then
      # Extract the post title and body here.
      # The ":" no-op keeps this otherwise empty branch valid Bash.
      :
    fi
    previous_resource_id=$resource_id
  done
done
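To make the "nearest start tag" lookup easier to try in isolation, here is a minimal sketch that wraps it in a function and runs it against a small temporary file. The function name nearest_resource_id and the sample IDs are hypothetical. The pattern follows id_regex above, except that it uses the POSIX class [[:space:]] instead of \s, on the assumption that this behaves the same for these tags while also working with non-GNU sed.

```shell
#!/usr/bin/env bash
# nearest_resource_id FILE LINE_NUMBER
# Print the id of the last div.post start tag found between line 1 and
# LINE_NUMBER of FILE. Prints nothing useful (an empty line at most) if
# that tag carries no numeric id. The function name is hypothetical.
nearest_resource_id() {
  local file=$1 line_number=$2
  local re='^.*<div(.*[[:space:]]+id="([0-9]+)")?.*[[:space:]]+class="post"(.*[[:space:]]+id="([0-9]+)")?.*$'
  sed -nE "1,${line_number}s/${re}/\2\4/p" "$file" | tail -n 1
}

# Demo file with the id attribute before and after the class attribute.
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
<div class="post" id="111">
<h3>First</h3>
</div>

<div id="222" class="post">
<h3>Second</h3>
</div>
EOF

nearest_resource_id "$demo_file" 2   # prints 111
nearest_resource_id "$demo_file" 6   # prints 222
rm -f "$demo_file"
```

Because class="post" must be followed directly by the closing quote, the div.post-body start tags do not match and cannot shadow the enclosing post.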
Extract Specified Elements using XML Parser
Extract the post title from the file corresponding to the resource ID specified in the previous section. I use xmlstarlet to extract the contents of these elements.
xmlstarlet fo -H "$file" 2>/dev/null |
xmlstarlet sel -t -c "//div[@id=\"$resource_id\"]/h3/node()"
Similarly, extract the post body and modify it if necessary. Then, pass it to jq as a single raw string so that characters reserved in JSON are escaped. In the following example, I add Blogger's “Read more” jump break after the first paragraph.
xmlstarlet fo -H "$file" 2>/dev/null |
  xmlstarlet sel -t -c \
    "//div[@id=\"$resource_id\"]/div[@class=\"post-body\"]/node()" |
  sed -E "0,/<\/p>/s//<\/p>\n\n<!-- more -->/" |
  jq -sR .
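The 0,/regex/ address and the \n escape in the replacement used above are GNU sed extensions. If the script has to run where only a POSIX sed is available, the same step can be approximated with awk. The sketch below is one such alternative, not the method used in this post, and the function name insert_jump_break is hypothetical.

```shell
#!/usr/bin/env bash
# insert_jump_break: copy stdin to stdout, appending a blank line and the
# "<!-- more -->" marker right after the first "</p>" only.
insert_jump_break() {
  awk '
    !done && /<\/p>/ {
      sub(/<\/p>/, "</p>\n\n<!-- more -->")
      done = 1
    }
    { print }
  '
}

# Example: the marker lands after the first paragraph; the second
# paragraph passes through unchanged.
printf '%s\n' '<p>The first paragraph.</p>' '<p>The second paragraph.</p>' |
  insert_jump_break
```

The done flag is what limits the substitution to the first match, which is the role the 0,/regex/ address plays in the sed version.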
Send PATCH Requests through Blogger API
Next, send PATCH requests with curl through the Blogger API, using the contents extracted in the previous section. Because the following API methods use OAuth 2.0 for authorization, prepare an access token as described in my preceding post.
The API does not seem to update draft posts. To check whether the post being updated is a draft, list the draft posts using the list method with the status=draft parameter, and then look for its resource ID among theirs.
curl -H "Authorization: Bearer $access_token" -X GET \
  "https://www.googleapis.com/blogger/v3/blogs/$blog_id/posts?status=draft" |
  jq -r '.items[]?.id' |
  grep -qxF "$resource_id"
If the post is a draft, temporarily publish it using the publish method.
curl -H "Authorization: Bearer $access_token" -X POST \
  "https://www.googleapis.com/blogger/v3/blogs/$blog_id/posts/$resource_id/publish"
Now, send the data of the element using the patch method. I decided to send the post title along with the post body because the title is occasionally updated and is much shorter than the body. Note that the Blogger API does not appear to support a description property.
curl -d "{\"title\": \"$title\", \"content\": $content}" \
  -H "Authorization: Bearer $access_token" \
  -H 'Content-Type: application/json; charset=utf-8' -X PATCH \
  "https://www.googleapis.com/blogger/v3/blogs/$blog_id/posts/$resource_id"
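One caveat with the request body above is that the title is interpolated into the JSON as-is, so a title containing a double quote or a backslash would produce invalid JSON. A possible way around this, sketched below, is to let jq build the whole object. The function name build_patch_payload is hypothetical, and unlike $content above, both arguments here are assumed to be raw, unescaped strings.

```shell
#!/usr/bin/env bash
# build_patch_payload TITLE CONTENT
# Print a compact JSON object with "title" and "content" members.
# jq performs the string escaping, so quotes or backslashes in either
# value are safe. Both arguments are raw (not yet JSON-encoded) strings.
build_patch_payload() {
  jq -cn --arg title "$1" --arg content "$2" \
    '{title: $title, content: $content}'
}

build_patch_payload 'A "quoted" title' '<p>Body</p>'
# Prints {"title":"A \"quoted\" title","content":"<p>Body</p>"}
```

The result could then be passed to curl as -d "$(build_patch_payload "$title" "$body")", where $body stands for the extracted post body before the jq -sR step.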
If the post was temporarily published as described above, revert it to the draft status using the revert method.
curl -H "Authorization: Bearer $access_token" -X POST \
  "https://www.googleapis.com/blogger/v3/blogs/$blog_id/posts/$resource_id/revert"
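The publish, patch, and revert steps can be tied together so that a draft is reverted even when the PATCH request fails. The following is a minimal sketch under that assumption, not the exact logic of my script. All function names are hypothetical wrappers around the curl calls shown above, and the -f, -s, and -S options are added so that curl returns a non-zero status on HTTP errors.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around the curl calls shown in this post.
# $access_token and $blog_id must be set before calling them.
api_base='https://www.googleapis.com/blogger/v3/blogs'

# post_action RESOURCE_ID ACTION  (ACTION is "publish" or "revert")
post_action() {
  curl -fsS -o /dev/null -X POST \
    -H "Authorization: Bearer $access_token" \
    "$api_base/$blog_id/posts/$1/$2"
}

# patch_post RESOURCE_ID JSON_PAYLOAD
patch_post() {
  curl -fsS -o /dev/null -X PATCH -d "$2" \
    -H "Authorization: Bearer $access_token" \
    -H 'Content-Type: application/json; charset=utf-8' \
    "$api_base/$blog_id/posts/$1"
}

# update_post RESOURCE_ID JSON_PAYLOAD IS_DRAFT  (IS_DRAFT: true or false)
# Publish a draft first, patch it, then revert it even if the patch
# failed. The exit status is that of the patch unless publish or revert
# itself fails.
update_post() {
  local id=$1 payload=$2 is_draft=$3 status=0
  if [ "$is_draft" = true ]; then
    post_action "$id" publish || return
  fi
  patch_post "$id" "$payload" || status=$?
  if [ "$is_draft" = true ]; then
    post_action "$id" revert || return
  fi
  return "$status"
}
```

A call would look like update_post "$resource_id" "$payload" true, with the last argument taken from the result of the draft check described earlier.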
Create Commit using Git
Finally, stage the local files and create a new commit using git commit.
git commit -a -m "Update $resource_id"
Bash Script Example
Combining the above processes, you can update posts without manually copying and pasting changes from local HTML files into the Blogger CMS. As a real-world application, I have published the update_blogger.sh Bash script on GitHub.