Getting Google Scholar to crawl the site

argeifontesfunes · September 19, 2024, 9:20pm

My repository contains only records whose main document is a PDF file. I have not been able to get Google Scholar to crawl the site. I have seen a two-year-old topic on this ([Tainacan metadata in the head]), but I do not understand the procedures. Is there some configuration that could help me get Google Scholar to recognize the site?

Best regards,

Pedro

mateus.m.luna · September 20, 2024, 2:14pm

Hi @argeifontesfunes!

To be honest, I’m not familiar with the process behind Google Scholar indexing. I’m not 100% sure whether it actually has its own index or is only a subset of Google’s search, filtered by well-known, dedicated publishing platforms.

In any case, Tainacan follows some practices that can help improve indexing, such as adding default `<meta>` tags to item pages and using proper heading tags. You should also follow other practices to improve your SEO, and some plugins can help with that.

But there are also things that we could do in Tainacan to improve this. Doing some research, I found out about the Dublin Core meta tags, which we could possibly add as well for Collections that have this mapping set. This is not high on our priority list so far, but it should be feasible and I’m interested in exploring it later. I’ve opened an issue for it:

github.com/tainacan/tainacan

Add dublin core meta information when collection is mapped.

opened 01:27PM - 20 Sep 24 UTC

mateuswetah

[Goal] Interoperability [Effort] Level 2 🤓

In [`add_social_meta()`](https://github.com/tainacan/tainacan/blob/bad21acdffdda…e647fe31c6314ff610ba83254a5/src/classes/theme-helper/class-tainacan-theme-helper.php#L818), we add common HTML `<meta>` tags to describe an item. In case a collection has its Dublin Core mapper applied, I believe we should do the same with attributes such as `DC.creator`, `DC.title`, etc.
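To make the idea in the issue concrete, here is a minimal sketch of what emitting such tags could look like. It is not the plugin's implementation: `example_dc_meta_tags()` is a hypothetical helper, and the field values are sample data rather than anything read from a Tainacan mapper.

```php
<?php
// Hypothetical helper (not part of Tainacan): builds Dublin Core <meta> tags
// from a map of DC attribute names to plain-text values.
function example_dc_meta_tags(array $fields): string
{
    $html = '';
    foreach ($fields as $name => $value) {
        // Escape both sides so quotes or ampersands cannot break the attribute.
        $html .= sprintf(
            "<meta name=\"%s\" content=\"%s\">\n",
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
        );
    }
    return $html;
}

// Sample values only; on a real site these would come from the mapped metadata.
echo example_dc_meta_tags([
    'DC.title'   => 'Sample item title',
    'DC.creator' => 'Sample Author',
]);
```

In a WordPress theme or plugin, output like this would typically be printed from a callback attached to the `wp_head` action.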

argeifontesfunes · September 23, 2024, 12:15pm

Thank you, Mateus. We are looking into the problem of adding the required meta tags so that Google Scholar recognizes the items for its indexing. Getting the Dublin Core meta tags in the head would be an ideal solution for us.

mateus.m.luna · September 24, 2024, 5:32pm

Would you have any developer available to work on this? I can provide some guidance if that is the case; I just can’t do it officially in the plugin for now.

argeifontesfunes · September 24, 2024, 10:06pm

Dear Mateus:
With a lot of patience and the help of ChatGPT, I think that I was able to solve the problem. I created a Blocksy child theme and included some code in the functions.php file. The meta tags appear at the end of the head section of the item pages (see Example).

However, I was not able to make `tainacan_get_the_item_document_url` return the URL of the item’s main document. Any thoughts? I am attaching the child theme file, in case you want to look at it or it is of any help to other users.

Best regards,

                    Pedro

blocksy-child.zip (338.2 KB)

mateus.m.luna · September 24, 2024, 11:46pm

Hahhahah amazing!

I love to see people coming up with solutions like that! The path you took works. The reason why it is a bit complicated to embed this logic in the plugin is exactly the fact that you had to define each metadata ID, which is why we would probably do this by fetching the information from one of our mappers. But it works fine for your case! My only suggestion would be to check whether the value is undefined, in order to avoid printing invalid content.
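As a rough illustration of that last suggestion, the sketch below filters out missing or blank values before anything is printed. `example_clean_values()` is a hypothetical helper, and the assumption that a metadatum value may arrive as a string, an array of strings, or `null` is mine, not something confirmed in this thread.

```php
<?php
// Hypothetical helper (not part of Tainacan): returns only the values that are
// safe to print, dropping null, blank strings and non-scalar entries.
function example_clean_values($raw): array
{
    // Treat a single value and a list of values the same way.
    $values = is_array($raw) ? $raw : [$raw];
    $clean  = [];
    foreach ($values as $value) {
        if (!is_scalar($value)) {
            continue; // null, nested arrays and objects are skipped
        }
        // Remove any HTML markup and surrounding whitespace.
        $text = trim(strip_tags((string) $value));
        if ($text !== '') {
            $clean[] = $text;
        }
    }
    return $clean;
}

// Only non-empty values produce a tag; the tag name here is just an example.
foreach (example_clean_values(['Sample Author', '   ', null]) as $creator) {
    printf("<meta name=\"DC.creator\" content=\"%s\">\n",
        htmlspecialchars($creator, ENT_QUOTES, 'UTF-8'));
}
```

With a guard like this, an item that lacks a given metadatum simply produces no tag, instead of a `<meta>` element with empty content.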

I should point out that it is calling a function that does not exist by default in Tainacan, `get_tainacan_metadata_value`. This is probably working because you also have a plugin activated where this function is defined… maybe the one where you were playing with view modes? Just keep this in mind in case you plan to deactivate that plugin in the future. If both solutions will always be used together in your case, you should also consider migrating this code to your plugin’s functions.php, since you won’t need a child theme just for this.

For anyone looking for this, similar functions would be `get_metadata_value`, from the Item Metadata Controller, and `get_item_metadatum_as_html`, from the Item class.

Regarding the document, you should call `tainacan_get_the_item_document_url()` passing the item ID as the parameter, not the item object. And if you plan on using `tainacan_get_attachment_html_url()`, in that case the parameter would be the attachment ID, not the item.
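A minimal sketch of that call, assuming it runs on a single item page where the current WordPress post is the Tainacan item, so that `get_the_ID()` yields the item ID. The callback name `example_print_document_meta()` and the choice of `DC.identifier` for the URL are illustrative assumptions, not something prescribed by Tainacan.

```php
<?php
// Hypothetical wp_head callback: prints a meta tag with the item's document URL.
function example_print_document_meta(): void
{
    // Do nothing when WordPress or Tainacan is not loaded.
    if (!function_exists('tainacan_get_the_item_document_url')
        || !function_exists('get_the_ID')
        || !function_exists('is_singular')
        || !function_exists('esc_url')) {
        return;
    }
    if (!is_singular()) {
        return; // only single pages have one current item
    }
    $item_id = get_the_ID(); // an integer ID, not an item object
    if (!$item_id) {
        return;
    }
    $url = tainacan_get_the_item_document_url($item_id);
    if (empty($url)) {
        return; // nothing to print for items without a document URL
    }
    // DC.identifier is an assumed tag name for this example.
    printf("<meta name=\"DC.identifier\" content=\"%s\">\n", esc_url($url));
}

// Register the callback only inside WordPress.
if (function_exists('add_action')) {
    add_action('wp_head', 'example_print_document_meta');
}
```

Passing an item object instead of the ID is the mistake described above, so the sketch keeps the ID in its own variable to make the distinction visible.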

Nice to see that you are finding a way.

argeifontesfunes · September 25, 2024, 12:15am

Thank you, Mateus, for the applause.

I will try the code in the functions.php file of my plugin. Unfortunately, I cannot get the main document URL: `tainacan_get_the_item_document_url($item_id)` returns an empty string.

Best regards,

                    Pedro

argeifontesfunes · September 25, 2024, 12:22am

Correction: `tainacan_get_the_item_document_url($item_id)` is working. Thank you, Mateus!

Best regards,

                    Pedro

system · September 30, 2024, 12:23am

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.
