This is a mirror of the Bibutils web site by Chris Putnam. Please see the original site for the most recent updates.
Bibutils v2
bibliography conversion utilities

description

The version 2 utilities use a fairly simple XML intermediate. Despite this, they have been used for some time and are fairly stable. Further, they include a couple of utilities (isi2xml, med2xml) not yet ported over to version 3. Hence, these programs are still available; however, I encourage all users to start migrating to version 3.

program tools

• bib2xml - convert bibtex to XML intermediate
• end2xml - convert endnote to XML intermediate
• isi2xml - convert ISI format to XML intermediate
• ris2xml - convert RIS format to XML intermediate
• med2xml - convert medline to XML intermediate
• xml2bib - convert XML intermediate into bibtex
• xml2ris - convert XML intermediate into RIS format for Reference Manager
• xml2en - convert XML intermediate into format for EndNote
• xmlreplace - do find/replace functions on specific XML fields
• uniqbib - ensure that automatically generated Bibtex keywords are unique

Version 2   (history)

bib2xml

bib2xml converts a bibtex-formatted reference file to an XML-intermediate bibliography file. Specify the file(s) to be converted on the command line. Files that define bibtex substitution strings should be listed before the files that use them (or the definitions should appear in the same file before their use). If no files are specified, bibtex input is read from standard input.

bib2xml bibtex_file.bib > output_file.xml


end2xml

end2xml converts a text endnote-formatted reference file to an XML-intermediate bibliography file. This program will not work on the binary library; the file needs to be exported first.

Endnote tagged formats look like:

 %0 Journal Article
 %A C. D. Putnam
 %A C. S. Pikaard
 %D 1992
 %T Cooperative binding of the Xenopus RNA polymerase I transcription factor xUBF to repetitive ribosomal gene enhancers
 %J Mol Cell Biol
 %V 12
 %P 4970-4980
 %F Putnam1992

Usage for end2xml is the same as bib2xml.

end2xml endnote_file.end > output_file.xml
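
To make the tagged layout concrete, here is a minimal Python sketch that splits one Endnote tagged record into (tag, value) pairs. It only illustrates the format shown above and is not how end2xml itself is implemented; the function name parse_endnote_record is hypothetical, and the sketch assumes every field starts on its own line with a two-character %X tag.

```python
def parse_endnote_record(text):
    """Split one Endnote tagged record into a list of (tag, value) pairs.

    Assumption (not from the bibutils sources): every field starts on its
    own line and has the form "%X value".
    """
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("%") or len(line) < 2:
            continue  # skip blank or untagged lines
        tag, value = line[:2], line[2:].strip()
        fields.append((tag, value))
    return fields


# Shortened copy of the record shown above (the %T line is left out).
record = """\
%0 Journal Article
%A C. D. Putnam
%A C. S. Pikaard
%D 1992
%J Mol Cell Biol
%V 12
%P 4970-4980
%F Putnam1992
"""

fields = parse_endnote_record(record)
# Repeated tags such as %A are kept in order, one pair per line.
authors = [value for tag, value in fields if tag == "%A"]
print(authors)
```

Keeping the pairs in a list rather than a dictionary preserves repeated tags such as %A, which a record uses once per author.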


isi2xml

isi2xml converts an ISI-formatted reference file to an XML-intermediate bibliography file. Usage is the same as for ris2xml and bib2xml.

isi2xml isi_file.ris > output_file.xml


ris2xml

ris2xml converts a RIS-formatted reference file to an XML-intermediate bibliography file. Usage is the same as for end2xml and bib2xml.

ris2xml ris_file.ris > output_file.xml


med2xml

med2xml generates an XML-style bibliography from Medline data. It is run like the other programs. The proper input format can be generated by downloading the references directly from PubMed: choose the "Text" display view and select the "Send to" button. Then take the created file and run it through the med2xml filter.

The resulting output should look like the following. Either copy and paste into a new file or save the file using the browser.

xml2bib

xml2bib converts an XML-intermediate file into a bibtex-formatted reference file. Usage is the same as for the other tools.

xml2bib xml_file.xml > output_file.bib


Command line options:

• -v, --version ; report version information
• -h, --help ; report help
• -fc, --finalcomma ; add final comma in the bibtex output for those that want it
• -sd, --singledash ; use one dash instead of two (longer dash in latex) between numbers in page output
• -b, --brackets ; use brackets instead of quotation marks around field data
• -w, --whitespace ; add beautifying whitespace to output

Default Output

 @ARTICLE{Putnam1992,
 AUTHOR="C. D. Putnam and C. S. Pikaard",
 YEAR="1992",
 MONTH="Nov",
 TITLE="Cooperative binding of the Xenopus RNA polymerase I transcription factor xUBF to repetitive ribosomal gene enhancers",
 JOURNAL="Mol Cell Biol",
 VOLUME="12",
 PAGES="4970--4980",
 NUMBER="11"}

Final Comma

 @ARTICLE{Putnam1992,
 AUTHOR="C. D. Putnam and C. S. Pikaard",
 YEAR="1992",
 MONTH="Nov",
 TITLE="Cooperative binding of the Xenopus RNA polymerase I transcription factor xUBF to repetitive ribosomal gene enhancers",
 JOURNAL="Mol Cell Biol",
 VOLUME="12",
 PAGES="4970--4980",
 NUMBER="11",}

Single Dash

 @ARTICLE{Putnam1992,
 AUTHOR="C. D. Putnam and C. S. Pikaard",
 YEAR="1992",
 MONTH="Nov",
 TITLE="Cooperative binding of the Xenopus RNA polymerase I transcription factor xUBF to repetitive ribosomal gene enhancers",
 JOURNAL="Mol Cell Biol",
 VOLUME="12",
 PAGES="4970-4980",
 NUMBER="11"}

Whitespace

 @ARTICLE{Putnam1992,
 AUTHOR = "C. D. Putnam and C. S. Pikaard",
 YEAR = "1992",
 MONTH = "Jan",
 TITLE = "Cooperative binding of the Xenopus RNA polymerase I transcription factor xUBF to repetitive ribosomal gene enhancers",
 JOURNAL = "Mol Cell Biol",
 VOLUME = "12",
 PAGES = "4970--4980"
 }
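
The effect of the four formatting options can be imitated in a few lines of Python. This is a sketch of what the options change, not xml2bib's own code: the function format_entry and its parameter names are hypothetical, and the exact line breaks, as well as the use of curly braces for the brackets option, are assumptions.

```python
def format_entry(entry_type, key, fields, final_comma=False,
                 single_dash=False, brackets=False, whitespace=False):
    """Format one bibtex entry from a list of (NAME, value) pairs.

    Each keyword argument mimics one xml2bib option described above.
    The layout details are assumptions, not taken from the program.
    """
    open_q, close_q = ("{", "}") if brackets else ('"', '"')
    equals = " = " if whitespace else "="
    lines = []
    for name, value in fields:
        if name == "PAGES" and single_dash:
            value = value.replace("--", "-")  # one dash between page numbers
        lines.append(f"{name}{equals}{open_q}{value}{close_q}")
    body = ",\n".join(lines)
    if final_comma:
        body += ","  # trailing comma after the last field
    closing = "\n}" if whitespace else "}"
    return f"@{entry_type}{{{key},\n{body}{closing}"


fields = [("VOLUME", "12"), ("PAGES", "4970--4980")]
print(format_entry("ARTICLE", "Putnam1992", fields))
print(format_entry("ARTICLE", "Putnam1992", fields,
                   final_comma=True, single_dash=True))
```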

xml2ris

xml2ris converts the XML intermediate to a RIS-formatted bibliography file. Usage is the same as for the other tools.

xml2ris xml_file.xml > output_file.ris


xml2en

xml2en converts the XML intermediate to a tagged Endnote bibliography file. Usage is the same as for the other tools.

xml2en xml_file.xml > output_file.end


xmlreplace

xmlreplace finds and replaces text within specific XML elements of an XML file. It is a UNIX-like filter (accepting input from files named on the command line or from standard input). The file of substitution commands must be the first file specified on the command line.

xmlreplace subs_file.txt ref1.xml ref2.xml > new.xml
cat ref1.xml ref2.xml | xmlreplace subs_file.txt > new.xml


The format of the substitution command file is very simple:

 :XML field:Old text:New text

or

 :XML field:Old text:New text:Comments

for example

 :TITLE:DNA:{DNA}
 :TITLE:E. coli:{{\it E. coli}}:Add latex formatting
 :TITLE:[Review]::The two adjacent field separators indicates deletion
 :JOURNAL:Proc Natl Acad Sci USA:PNAS

The field separator (a colon, :, in this example) can be any character, so long as it does not occur within the XML field, the old text, or the new text. The program will use the first character on the line as the separator.
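
As a rough illustration of the separator rule, the following Python sketch splits one substitution line into its parts. It is not xmlreplace's own code; the function name parse_substitution is hypothetical, and the way it treats extra separators inside the comment is an assumption made for the example.

```python
def parse_substitution(line):
    """Parse one substitution line into (field, old, new, comment).

    The first character of the line is taken as the field separator,
    as described above. Error handling here is an assumption.
    """
    line = line.rstrip("\n")
    if not line:
        raise ValueError("empty substitution line")
    sep = line[0]
    parts = line[1:].split(sep)
    if len(parts) < 3:
        raise ValueError("expected field, old text and new text")
    field, old, new = parts[0], parts[1], parts[2]
    comment = sep.join(parts[3:])  # anything after the new text
    return field, old, new, comment


# Colon-separated rule taken from the examples above.
print(parse_substitution(r":TITLE:E. coli:{{\it E. coli}}:Add latex formatting"))

# Hypothetical rule whose text contains a colon, so "|" is used instead.
print(parse_substitution("|TITLE|DNA: a review|{DNA}: a review"))
```

An empty third part, as in the [Review] example, comes back as an empty new text, which corresponds to deletion.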

The order of the replacements will follow the order in the file. Hence, the following XML string is converted by the substitution file to the output:

 Input
 DNA replication in Escherichia species especially Escherichia coli

 Subs File
 :TITLE:Escherichia:{{\it Escherichia}}
 :TITLE:Escherichia coli:{{\it Escherichia coli}}

 Output
 DNA replication in {{\it Escherichia}} species especially {{\it Escherichia}} coli

The first substitution disrupts the exact match for Escherichia coli due to the insertion of the "}}" characters. The solution is simple: switch the order of the substitutions:

 Subs File
 :TITLE:Escherichia coli:{{\it Escherichia coli}}
 :TITLE:Escherichia:{{\it Escherichia}}

 Output
 DNA replication in {{\it Escherichia}} species especially {{\it {{\it Escherichia}} coli}}
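
The order dependence can be reproduced with a short Python sketch. It assumes that each rule is a plain literal replacement applied to the result of the previous rule; that assumption is consistent with both outputs shown above, but it is not confirmed from the xmlreplace sources, and the function name apply_in_order is hypothetical.

```python
def apply_in_order(text, substitutions):
    """Apply (old, new) literal replacements one after another."""
    for old, new in substitutions:
        text = text.replace(old, new)
    return text


title = "DNA replication in Escherichia species especially Escherichia coli"
genus = ("Escherichia", r"{{\it Escherichia}}")
species = ("Escherichia coli", r"{{\it Escherichia coli}}")

# Genus rule first: the species rule no longer finds "Escherichia coli".
genus_first = apply_in_order(title, [genus, species])

# Species rule first: the genus rule then also matches inside the
# text that the species rule inserted.
species_first = apply_in_order(title, [species, genus])

print(genus_first)
print(species_first)
```

Under this assumption every rule sees the text as left by the rules before it, including text that earlier rules inserted.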

uniqbib

uniqbib resolves duplicate bibtex identifiers, which can frequently occur by autogeneration of bibtex keys and can be difficult to resolve. There are two ways to run uniqbib. The first is to simply scan a bibtex file to search for duplicate identifiers:

uniqbib file.bib


The second way scans a file and then writes out an altered file that resolves the duplications. This is controlled by specifying a second filename.

uniqbib file.bib new.bib
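
For readers who want to see what a duplicate-identifier scan involves, here is a minimal Python sketch of the first mode only. It is not uniqbib's algorithm: the regular expression is a simplification that can misread unusual entries (for example @STRING definitions whose value contains a comma), the function name duplicate_keys is hypothetical, and the sample entries are made-up test data.

```python
import re
from collections import Counter

# Matches "@TYPE{key," and captures the key (simplified bibtex syntax).
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def duplicate_keys(bibtex_text):
    """Return the sorted list of entry keys that occur more than once."""
    counts = Counter(ENTRY_KEY.findall(bibtex_text))
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up test data: the same key imported twice plus one unrelated key.
sample = """
@ARTICLE{Putnam1992, YEAR="1992"}
@ARTICLE{Putnam1992, YEAR="1992"}
@ARTICLE{Placeholder2000, TITLE="Placeholder entry"}
"""

print(duplicate_keys(sample))
```

How the duplicates are then renamed in the rewritten file is left out, since the scheme uniqbib uses is not described here.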


faq

Check here for help. Files can be saved by right-clicking on the link. This will pull up a context-sensitive menu, from which you should choose "Save Link As..." (or whatever the appropriate item is for your web browser). Simply clicking on the links frequently loads the binary into the browser window. Not terribly useful.

Downloads on this page are going to be archives of all of the executables (as zipped or tarred/gzipped files depending on the architecture).

The programs don't run for me. What am I doing wrong?

• "command not found" The message "command not found" on Linux/UNIX/MacOSX systems indicates that the commands cannot be found. This could mean that the programs are not flagged as being executable (run "chmod ugo+x xml2bib" for the appropriate binaries) or the executables are not in your current path (and have to be specified directly like ./xml2bib). A quick web search on chmod or path variables should provide many detailed resources.

• I'm running MacOSX and I just get a terminal when I double-click on the programs. Simply put, this is not the way to run the programs. You want to run the terminal first and then issue the commands at the command line. It should be under Applications->Utilities->Terminal on most MacOSX systems I've seen. If you just double-click the program, the terminal corresponds to the input to the tool. Not so useful.

Some links to get you started running the terminal in a standard UNIX-like fashion are at TerminalBasics.pdf [homepage.mac.com], [macdevcenter.com], and [ee.surrey.ac.uk].

I'm happy to help with specific questions, but the more knowledgeable you are the easier it will be to help (and I frankly don't have the time to help everyone learn basic UNIX).

I am very interested in bug reports and problems in conversions. Feel free to e-mail me if you run into these issues. The absolute best bug reports provide error messages from the operating systems and/or input and outputs that detail the problems. Please remember that I'm not looking over your shoulder and I cannot read your mind to figure out what you are doing--"It doesn't work." isn't a bug report I can help you with.

You have a MacOSX version, can you give me a MacOS9 version?

Sorry. I'd like to, but these programs assume a command-line interface with normal standard in, standard out, and standard error along with command-line arguments. MacOSX is a fundamental change in the operating system with a BSD (UNIX-like) core that I'm taking advantage of to provide a MacOSX binary. On the other hand, I don't know that much about MacOS9, and if it is possible to generate a useful binary from these sources let me know.

This stuff is great, how can I help?

OK, I actually don't get this question so often, though I have gotten very useful help through people who have willingly sent useful bug reports and sample problematic data to allow me to test these programs. I willingly accept bug reports, patches, new filters, suggestions on program improvements or better documentation and the like. All I can say is that users (or programmers) who contact me with these sorts of things are far more likely to get their itches scratched.

All versions of bibutils are released under the GNU General Public License (GPL). In a nutshell, feel free to download, run, and modify these programs as required. If you re-release these, you need to release the modified version of the source. (And I'd appreciate patches as well...if you care enough to make the change, then I'd like to see what you're adding or fixing.)

Chris Putnam, Ph.D.
cdputnam@scripps.edu
The Scripps Research Institute
Last Updated: 07/26/04