My repository contains only records whose main document is a PDF file. I have not been able to get Google Scholar to crawl the site. I have seen a two-year-old topic on this ([Tainacan metadata in the head]), but I do not understand the procedure. Is there some configuration that could help me get Google Scholar to recognize the site?
To be honest, I’m not familiar with the process behind Google Scholar indexing. I’m not sure whether it has its own index or is only a subset of Google’s search, filtered to well-known, dedicated publishing platforms.
In any case, Tainacan already follows some practices that can help indexing, such as adding default <meta> tags to item pages and using proper heading tags, but you should also follow other practices to improve your SEO, and some plugins can help with that.
But there are also things that we could do in Tainacan to improve this. Doing some research, I found out about the Dublin Core meta tags, which we could add to collections that have this mapping set. This is not high on our priority list so far, but it should be feasible and I’m interested in exploring it. I’ve opened an issue for it:
Thank you, Mateus. We are looking into adding the required meta tags so that Google Scholar recognizes the items for indexing. Getting the Dublin Core meta tags into the head would be an ideal solution for us.
Would you have a developer available to work on this? I can provide some guidance if so; I just can’t do it officially in the plugin for now.
Dear Mateus:
With a lot of patience and the help of ChatGPT, I think I was able to solve the problem. I created a Blocksy child theme and added some code to its functions.php file. The meta tags now appear at the end of the head section of each item page (see Example).
However, I was not able to get tainacan_get_the_item_document_url to return the URL of the item’s main document. Any thoughts? I am attaching the child theme file, in case you want to look at it or it is of any help to other users.
I love to see people coming up with solutions like that! The path you took works. The reason it is a bit complicated to embed this logic in the plugin is exactly that you had to define each metadata ID, which is why we would probably do this by fetching the information from one of our mappers. But it works fine for your case! My only suggestion would be to check whether each value is empty or undefined before printing it, to avoid outputting invalid content.
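The check suggested above could look like the following minimal sketch. It is plain PHP, not Tainacan code: the function name `example_build_meta_tags` and the sample values are hypothetical, and the tag names (`citation_title`, `citation_author`, `DC.date`) are only assumptions about which tags you want to print.

```php
<?php
/**
 * Build <meta> tags from a name => value map, skipping empty values.
 * Hypothetical helper for illustration; not part of Tainacan or WordPress.
 */
function example_build_meta_tags(array $values): string
{
    $lines = [];
    foreach ($values as $name => $value) {
        // Ignore anything that is not a plain string or number (null, false, arrays...).
        if (!is_string($value) && !is_int($value) && !is_float($value)) {
            continue;
        }
        $value = trim((string) $value);
        // Skip empty values so no tag is printed with blank content.
        if ($value === '') {
            continue;
        }
        $lines[] = sprintf(
            '<meta name="%s" content="%s">',
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars($value, ENT_QUOTES, 'UTF-8')
        );
    }
    return implode("\n", $lines);
}

// Sample values (made up); in practice they would come from the item metadata.
echo example_build_meta_tags([
    'citation_title'  => 'A "sample" title',
    'citation_author' => '',    // empty: skipped
    'DC.date'         => null,  // undefined: skipped
]), "\n";
```

Inside a theme or plugin you would probably escape with WordPress's `esc_attr()` instead of `htmlspecialchars()`, and echo the result from a `wp_head` action.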
I should point out that your code calls a function that does not exist by default in Tainacan, namely get_tainacan_metadata_value. It is probably working because you also have a plugin activated where this function is defined… maybe the one where you were playing with view modes? Just keep this in mind in case you plan to deactivate that plugin in the future. If both solutions will always be used together in your case, you should also consider migrating this code to your plugin’s functions.php, since then you won’t need a child theme just for this.
For anyone looking for this, similar functions would be get_metadata_value, from the Item Metadata Controller, and get_item_metadatum_as_html, from the Item class.
Regarding the document, you should call tainacan_get_the_item_document_url() passing the item ID as the parameter, not the item object. And if you plan on using tainacan_get_attachment_html_url(), the parameter in that case would be the attachment ID, not the item.
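As a rough sketch of the point about passing the ID: the helper below receives the item ID (an integer) and a callable that resolves the document URL, and returns a `citation_pdf_url` tag, or an empty string when no URL comes back. The helper name `example_pdf_meta_tag` and the stub resolver are hypothetical, used only so the snippet runs outside WordPress.

```php
<?php
/**
 * Return a citation_pdf_url meta tag for an item, or '' if no URL is found.
 * Hypothetical helper; $resolve_url stands in for the Tainacan template tag.
 */
function example_pdf_meta_tag(int $item_id, callable $resolve_url): string
{
    // Pass the numeric item ID, not the item object.
    $url = $resolve_url($item_id);
    if (!is_string($url) || trim($url) === '') {
        return ''; // nothing to print when the URL is missing
    }
    return sprintf(
        '<meta name="citation_pdf_url" content="%s">',
        htmlspecialchars(trim($url), ENT_QUOTES, 'UTF-8')
    );
}

// Stub resolver with made-up data, so the sketch runs without WordPress.
$stub = function (int $id): string {
    $documents = [10 => 'https://example.org/files/paper.pdf'];
    return $documents[$id] ?? '';
};

echo example_pdf_meta_tag(10, $stub), "\n"; // item with a document
echo example_pdf_meta_tag(99, $stub), "\n"; // item without one: prints an empty line
```

Inside a theme or plugin, the callable would be `'tainacan_get_the_item_document_url'` and the ID would come from something like `(int) get_the_ID()` on the item page. Returning an empty string when the URL is missing keeps a blank tag from being printed.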
I will try the code in the functions.php file of my plugin. Unfortunately, I still cannot get the main document URL: tainacan_get_the_item_document_url($item_id) returns an empty string.