on mendeley
Posted by razor | Filed under English, Szolgáltatás
I have almost finished importing and correcting a large number of articles, books, tutorials and other material in Mendeley, and I promised on Twitter to write about bugs, missing features, etc. In general, title and author extraction is quite good by now, but there are still problems with journal titles, volume and page numbers, issues and DOIs. Bugs and problems first:
- Journal name extraction can be quite problematic:
  - Genome Informatics is usually just Genome In
  - Bioinformatics becomes Bioinformatics (Oxford, England)
  - Genome Research almost always appears as Genome Res
  - Science is called Science (New York, N.Y.), even after a DOI search
  - For the NAR database and web server issues, the extracted journal name looks like Nucleic Acids Res., 32, D138-141
- Author name extraction is poor for Bioinformatics, Genome Informatics, Cell and Genome Research, even when the full names appear in the text or a DOI is available. For NAR articles, the last 2 or 3 authors are sometimes missing completely.
- Page numbers, volume and issue numbers are wrong about half of the time.
- Genome Research and PNAS DOIs are generally missing, although they appear at the end of the first page as a link: ‘Article and publication are at http://www.genome.org/cgi/doi/10.1101/ gr.6902’.
- For Nature News and similar items, metadata extraction almost always fails.
- If the downloaded PDF starts in the middle of a page, as with a Trends in Genetics or Science article, metadata extraction also fails.
- There are NO periods at the end of article titles. Don’t “correct” titles by adding one.
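To make the journal name and DOI problems above more concrete, here is a minimal Python sketch of the kind of cleanup that would catch these cases. It is only an illustration under stated assumptions, not how Mendeley works internally: the function names `normalize_journal` and `find_doi`, the lookup table, and the rule of tolerating one stray space after the DOI slash are all made up for this example.

```python
import re

# Hypothetical lookup table built only from the examples listed above;
# a real table would need many more entries.
ABBREVIATIONS = {
    "genome in": "Genome Informatics",
    "genome res": "Genome Research",
    "nucleic acids res": "Nucleic Acids Research",
}

# A DOI prefix ("10." plus 4-9 digits), a slash, an optional stray space
# (an assumption, to cope with PDF line breaks), then the suffix.
DOI_PATTERN = re.compile(r"\b(10\.\d{4,9})/\s?([^\s\"'<>]+)")


def normalize_journal(raw: str) -> str:
    """Clean up a raw extracted journal name."""
    # Drop a trailing parenthetical such as "(Oxford, England)".
    name = re.sub(r"\s*\([^)]*\)\s*$", "", raw)
    # Drop volume/page residue after the first comma. This heuristic
    # would also truncate a journal whose real name contains a comma.
    name = name.split(",", 1)[0]
    # Remove surrounding whitespace and a trailing abbreviation period.
    name = name.strip().rstrip(".")
    # Expand known abbreviations; otherwise keep the cleaned name.
    return ABBREVIATIONS.get(name.lower(), name)


def find_doi(text: str):
    """Return the first DOI-like string found in text, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Rejoin prefix and suffix without the stray space and strip
    # trailing punctuation that belongs to the sentence, not the DOI.
    return f"{match.group(1)}/{match.group(2)}".rstrip(".,;")
```

With this sketch, `normalize_journal("Science (New York, N.Y.)")` gives `"Science"`, and `find_doi` applied to the Genome Research link text quoted above gives `"10.1101/gr.6902"`. Tolerating a space after the slash is a guess that can produce false matches, so the result would still need to be checked with a DOI lookup.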
Features I want:
- Maybe some hashing algorithm for finding the same article in different users’ libraries, so that metadata which somebody has already corrected can be combined and reused. This could be really handy for older articles with no DOIs, etc. Ask the Dropbox guys about file hashing, and last.fm about community metadata correction.
- I’d like to attach the various supplementary materials (additional PDFs, Excel tables, etc.) to the articles somehow.
- I would also like to add links. This would be very useful with articles describing web services, online databases and software. I usually go straight to the website after reading the abstract, and never read the full text, only the tutorials, descriptions at the site, or do some test query, download and compile the code, etc.
- Check the DOIs again once I’m back online, if I edited something (or added a DOI) while I was offline.
- I want an article recommendation engine, and I also want more detailed stats at the website. :)
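The hashing idea from the first feature request can be sketched in a few lines of Python. This is a rough illustration only, assuming a SHA-256 hash over the raw file bytes as the shared key; the `SharedMetadataStore` class and its method names are hypothetical and say nothing about how Mendeley, Dropbox or last.fm actually do it.

```python
import hashlib


def file_fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact file content."""
    return hashlib.sha256(data).hexdigest()


class SharedMetadataStore:
    """Hypothetical in-memory store of corrected metadata keyed by file hash."""

    def __init__(self):
        self._by_hash = {}

    def submit_correction(self, data: bytes, metadata: dict) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._by_hash[file_fingerprint(data)] = dict(metadata)

    def lookup(self, data: bytes):
        # Returns the corrected metadata, or None if nobody has
        # submitted this exact file yet.
        return self._by_hash.get(file_fingerprint(data))


# One user corrects the metadata, another user with the same file reuses it.
store = SharedMetadataStore()
pdf_bytes = b"%PDF-1.4 placeholder content standing in for a real article"
store.submit_correction(pdf_bytes, {"journal": "Genome Research"})
reused = store.lookup(pdf_bytes)
```

A plain content hash like this only matches byte-identical copies. A scanned version of the same article, or a PDF with a different publisher watermark, would hash differently and would need some fuzzier matching on top.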
Tags: annotation, articles, journals, mendeley, metadata, pdf, software
March 15th, 2010 at 19:36
Hi Razor! It’s Mr. Gunn from Twitter/FriendFeed. I’ve taken the liberty of forwarding your feedback to Mendeley, but I thought I’d let you know that many of the features you requested are actually available now.
Metadata correction/hashing: There’s already hashing going on behind the scenes, but it sometimes fails when an item is present in multiple forms (like a native PDF vs. a scanned PDF), so it doesn’t always work. You may already know this, but one of the founders of last.fm is also on the Mendeley board, so expect this problem to get sorted out soon.
Adding supplementary data: You can do this now. The file box at the bottom of the Document details tab on the right allows you to add .doc, .html, .xls, .tex and many other file types. There’s also a space to add links.
Next to the DOI box is a button that will do a look-up whenever you request one.
On the website, there’s a related items feature on an article’s details page. It’s still under development, so it’s not perfect yet, but it’s improving quickly. Same for the stats.
Thanks again for the detailed review. I’ll be sure that someone from support follows up on this.
March 15th, 2010 at 19:56
Thanks for the comment!
I didn’t know that a last.fm founder is also on the Mendeley board, but now I hope for the best. It also wasn’t obvious to me that I could add additional files and links, but I’ll add them now. :)