This is a mirror of the Bibutils web site by Chris Putnam. Please see the original site for the most recent updates.
Bibutils
bibliography conversion utilities
description  
 

The bibutils program set interconverts between various bibliography formats using a common MODS-format XML intermediate. For example, one can convert RIS-format files to BibTeX by doing two transformations: RIS->MODS->BibTeX. By using a common intermediate for N formats, only 2N programs are required rather than N²-N. These programs operate on the command line and are styled after standard UNIX-like filters.
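
To make the 2N versus N²-N comparison concrete, here is a small shell sketch. The function name converter_counts and the value 6 are only illustrations; 6 is not a claim about how many formats bibutils supports.

```shell
# Compare converter counts for n bibliography formats.
# With a shared intermediate, each format needs one reader and one writer (2n).
# With direct conversion, every ordered pair of formats needs its own program (n*n - n).
converter_counts() {
    n=$1
    echo "formats=$n hub=$((2 * n)) pairwise=$((n * n - n))"
}

converter_counts 6   # prints: formats=6 hub=12 pairwise=30
```

The gap widens quickly: adding one more format costs two programs with the intermediate, but 2N programs with direct pairwise conversion.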

I primarily use these tools at the command line, but they are suitable for scripting and have been incorporated into a number of different bibliographic projects.
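
As one illustration of scripting use, the sketch below converts every BibTeX file in a directory to a MODS file next to it. The function name convert_dir and the directory layout are assumptions made for the example; only bib2xml and its "file in, XML on standard output" usage come from this page.

```shell
# Convert every .bib file in a directory to a MODS .xml file beside it.
# convert_dir is a hypothetical helper name, not part of bibutils.
convert_dir() {
    dir=$1
    for bib in "$dir"/*.bib; do
        [ -e "$bib" ] || continue            # skip when the glob matches nothing
        bib2xml "$bib" > "${bib%.bib}.xml"   # refs.bib -> refs.xml
    done
}

# Usage: convert_dir ~/papers/bibliographies
```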



MODS  
 

The XML intermediate is the Library of Congress's Metadata Object Description Schema (MODS) version 3.1. This is a very flexible standard that should prove quite useful as the number of tools that directly interact with it increases. For other programmers writing tools that work with MODS, I've written a quick introduction.



program tools  
 
  • bib2xml - convert BibTeX to MODS XML intermediate
  • copac2xml - convert COPAC format references to MODS XML intermediate
  • end2xml - convert EndNote (Refer format) to MODS XML intermediate
  • endx2xml - convert EndNote XML to MODS XML intermediate
  • isi2xml - convert ISI web of science to MODS XML intermediate
  • med2xml - convert Pubmed XML references to MODS XML intermediate
  • modsclean - a MODS to MODS converter, mostly for testing purposes
  • ris2xml - convert RIS format to MODS XML intermediate
  • xml2ads - convert MODS XML intermediate into Smithsonian Astrophysical Observatory (SAO)/National Aeronautics and Space Administration (NASA) Astrophysics Data System or ADS reference format (converter submitted by Richard Mathar)
  • xml2bib - convert MODS XML intermediate into BibTeX
  • xml2end - convert MODS XML intermediate into format for EndNote
  • xml2isi - convert MODS XML intermediate to ISI format
  • xml2ris - convert MODS XML intermediate into RIS format
  • xml2wordbib - convert MODS XML intermediate into Word 2007 bibliography format


downloads  
      Version 3   (history)

  • bibutils_3.40_i386.tgz - x86 Linux binaries
  • bibutils_3.40_win.zip - Windows binaries
  • bibutils_3.40_osx.tgz - MacOSX binaries
  • bibutils_3.35_osx_universal.tgz - binaries for Mac Intel (thanks to Joseph Slater)
  • bibutils_3.40_src.tgz - C source code

    The older Bibutils version 2 generates a non-standard XML intermediate that isn't as useful as MODS. I'm still keeping it available; however, I encourage all users to migrate to the latest version 3 release.

    Starting with the version 3.15 release, the programs have been reorganized into a library that can be plugged into other projects. Documentation for the library API will be available shortly.



MODS flags  
     

    Several flags are available for the end2xml, endx2xml, bib2xml, ris2xml, med2xml, and copac2xml programs:

    
    -h, --help                 display help
    
    -v, --version              display version
    
    -a, --add-refcount         add "_#", where # is reference count to reference id
    
    -s, --single-refperfile    put one reference per file, named by the reference number
    
    -i, --input-encoding       interpret the input file as using the requested 
                                 character set (use w/o argument for current list 
                                 derived from character sets at www.kostis.net)
                                 unicode is now a character set option
    
    -u, --unicode-characters   encode unicode characters directly in the file 
                                 rather than as XML entities
    
    -un,--unicode-no-bom       as -u, but don't include a byte order mark
    
    -d, --drop-key             don't put citation key in the mods id field
    
    -c, --corporation-file     with argument specifying a file containing a list
                                 of corporation names to be placed in 
                                 <name type="corporate"></name> instead
                                 of type="personal" and eliminate name mangling
    
    --verbose                  verbose output
    
    --debug                    very verbose output (mostly for debugging)
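
A short sketch of how the flags above might be combined on a real command line. It assumes the short flags can be given as separate arguments before the input file; the function name bib_to_mods_unicode and the file name refs.bib are hypothetical.

```shell
# Convert BibTeX to MODS, writing Unicode characters directly without a
# byte order mark (-un) and leaving the citation key out of the MODS id (-d).
# bib_to_mods_unicode is a hypothetical helper name.
bib_to_mods_unicode() {
    bib2xml -un -d "$1"
}

# Usage: bib_to_mods_unicode refs.bib > refs.xml
# Per the -i description above, running the flag without an argument
# should list the available character sets:
#   bib2xml -i
```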
    


    bib2xml  
     

    bib2xml converts a BibTeX-formatted reference file to an XML-intermediate bibliography file. Specify the file(s) to be converted on the command line. Files containing BibTeX substitution strings should be specified before the files where the substitutions are used (or the strings should appear in the same file before their use). If no files are specified, BibTeX information is read from standard input.

    bib2xml BibTeX_file.bib > output_file.xml
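
The ordering rule for substitution strings can be sketched as follows. The file names strings.bib and refs.bib and the helper name bib_with_strings are assumptions for illustration.

```shell
# Convert BibTeX to MODS, always passing the file of @string definitions
# before the file that uses them. bib_with_strings is a hypothetical helper.
bib_with_strings() {
    strings_file=$1
    refs_file=$2
    bib2xml "$strings_file" "$refs_file"
}

# Usage: bib_with_strings strings.bib refs.bib > refs.xml
# With no file arguments, bib2xml reads standard input instead:
#   cat refs.bib | bib2xml > refs.xml
```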
    


    copac2xml  
     

    copac2xml converts a COPAC-formatted reference file to a MODS XML-intermediate bibliography file.

    end2xml  
     

    end2xml converts a text EndNote-formatted reference file to an XML-intermediate bibliography file. This program will not work on the binary library; the file needs to be exported first.

    EndNote tagged formats ("Refer" format export) look like:

    %0 Journal Article
    %A C. D. Putnam
    %A C. S. Pikaard
    %D 1992
    %T Cooperative binding of the Xenopus RNA polymerase I 
     transcription factor xUBF to repetitive ribosomal gene enhancers
    %J Mol Cell Biol
    %V 12
    %P 4970-4980
    %F Putnam1992
    

    There are very nice instructions for making sure that you are properly exporting this at http://www.sonnysoftware.com/endnoteimport.html

    Usage for end2xml is the same as bib2xml.

    end2xml endnote_file.end > output_file.xml
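
Because end2xml only handles the tagged text export, a wrapper can check for a record-type tag before converting. This is a sketch: the grep test for a line starting with "%0 " is an illustrative heuristic based on the sample record above, not something end2xml itself does, and refer_to_mods is a hypothetical name.

```shell
# Convert a tagged ("Refer" format) EndNote export to MODS, refusing files
# that contain no "%0 " record-type line (for example, a binary library).
# refer_to_mods and its check are illustrative, not part of bibutils.
refer_to_mods() {
    if ! grep -q '^%0 ' "$1"; then
        echo "$1: no %0 tag found; export the library as tagged text first" >&2
        return 1
    fi
    end2xml "$1"
}

# Usage: refer_to_mods endnote_file.end > output_file.xml
```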
    


    endx2xml  
     

    endx2xml converts an EndNote XML exported reference file to a MODS XML-intermediate bibliography file. This program will not work on the binary library; the file needs to be exported first.

    isi2xml  
     

    isi2xml converts an ISI-web-of-science-formatted reference file to an XML-intermediate bibliography file.

    Usage for isi2xml is the same as bib2xml.

    isi2xml input_file.isi > output_file.xml
    


    ris2xml  
     

    ris2xml converts a RIS-formatted reference file to an XML-intermediate bibliography file. Usage for ris2xml is the same as for end2xml and bib2xml.

    ris2xml ris_file.ris > output_file.xml
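
The RIS->MODS->BibTeX conversion mentioned in the description can be done without a temporary file by piping the two tools together. This sketch assumes xml2bib reads MODS from standard input when no file is given, which this page states explicitly only for bib2xml; ris_to_bib is a hypothetical helper name.

```shell
# RIS -> MODS -> BibTeX in one pipeline.
# Assumption: xml2bib reads standard input when given no file argument.
ris_to_bib() {
    ris2xml "$1" | xml2bib
}

# Usage: ris_to_bib ris_file.ris > output_file.bib
```

If the standard-input assumption does not hold on your build, write the MODS output to a file first and pass that file to xml2bib.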
    


    xml2bib  
     

    xml2bib converts the MODS XML bibliography into a BibTeX-formatted reference file. Usage for xml2bib is the same as for the other tools.

    xml2bib xml_file.xml > output_file.bib
    

    Starting with 3.24, xml2bib output uses lowercase tags and mixed case reference types for better interaction with Emacs. The older behavior with all uppercase tags/reference types can still be generated using the command-line switch -U/--uppercase.

    Command line options:

    • -v, --version ; report version information
    • -h, --help ; report help
    • -fc, --finalcomma ; add final comma in the BibTeX output for those that want it
    • -sd, --singledash ; use one dash instead of two (the longer dash in LaTeX) between numbers in page output
    • -b, --brackets ; use brackets instead of quotation marks around field data
    • -w, --whitespace ; add beautifying whitespace to output
    • -s, --single-refperfile ; put one reference per file, named by the reference number
    • -o, --output-encoding ; write the output file using the requested character set (use w/o argument for current list derived from character sets at www.kostis.net); unicode is now a character set option
    • -U, --uppercase ; use all uppercase for tags (field names) and reference types (pre-3.24 behavior)
    • -sk, --strictkey ; ensure only alphanumeric characters are used in BibTeX reference keys
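
As a sketch of combining these options, the helper below asks for bracketed field data, aligned whitespace, and a final comma in one call. It assumes the options can be given together as separate arguments; mods_to_bib_styled and refs.xml are hypothetical names.

```shell
# Write BibTeX with brackets around field data (-b), beautifying
# whitespace (-w), and a comma after the last field (-fc).
# mods_to_bib_styled is a hypothetical helper name.
mods_to_bib_styled() {
    xml2bib -b -w -fc "$1"
}

# Usage: mods_to_bib_styled refs.xml > refs.bib
# For the pre-3.24 all-uppercase style, use -U instead:
#   xml2bib -U refs.xml > refs.bib
```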

    Default Output
    @Article{Putnam1992,
    author="C. D. Putnam
    and C. S. Pikaard",
    year="1992",
    month="Nov",
    title="Cooperative binding of the 
    Xenopus RNA polymerase I transcription 
    factor xUBF to repetitive ribosomal 
    gene enhancers",
    journal="Mol Cell Biol",
    volume="12",
    pages="4970--4980",
    number="11"}
    
    Final Comma
    @Article{Putnam1992,
    author="C. D. Putnam
    and C. S. Pikaard",
    year="1992",
    month="Nov",
    title="Cooperative binding of the 
    Xenopus RNA polymerase I transcription 
    factor xUBF to repetitive ribosomal 
    gene enhancers",
    journal="Mol Cell Biol",
    volume="12",
    pages="4970--4980",
    number="11",}
    
    Single Dash
    @Article{Putnam1992,
    author="C. D. Putnam
    and C. S. Pikaard",
    year="1992",
    month="Nov",
    title="Cooperative binding of the 
    Xenopus RNA polymerase I transcription 
    factor xUBF to repetitive ribosomal 
    gene enhancers",
    journal="Mol Cell Biol",
    volume="12",
    pages="4970-4980",
    number="11"}
    
    Whitespace
    @Article{Putnam1992,
      author =      "C. D. Putnam
                    and C. S. Pikaard",
      year =        "1992",
      month =       "Jan",
      title =       "Cooperative binding of 
    the Xenopus RNA polymerase I transcription 
    factor xUBF to repetitive ribosomal gene 
    enhancers",
      journal =     "Mol Cell Biol",
      volume =      "12",
      pages =       "4970--4980"
    }
    
    Brackets
    @Article{Putnam1992,
    author={Putnam, C. D.
    and Pikaard, C. S.},
    title={Cooperative binding of the Xenopus 
    RNA polymerase I transcription factor xUBF 
    to repetitive ribosomal gene enhancers},
    journal={Mol Cell Biol},
    year={1992},
    month={Nov},
    volume={12},
    number={11},
    pages={4970--4980}
    }
    
    Uppercase
    @ARTICLE{Putnam1992,
    AUTHOR="Putnam, C. D.
    and Pikaard, C. S.",
    TITLE="Cooperative binding of the Xenopus
    RNA polymerase I transcription factor xUBF
    to repetitive ribosomal gene enhancers",
    JOURNAL="Mol Cell Biol",
    YEAR="1992",
    MONTH="Nov",
    VOLUME="12",
    NUMBER="11",
    PAGES="4970--4980"
    }
    



    xml2ris  
     

    xml2ris converts the MODS XML bibliography to a RIS-formatted bibliography file. Usage for xml2ris is the same as for the other tools.

    xml2ris xml_file.xml > output_file.ris
    


    Command line options:

    • -v, --version ; report version information
    • -h, --help ; report help
    • -s, --single-refperfile ; put one reference per file, named by the reference number
    • -o, --output-encoding ; write the output file using the requested character set (use w/o argument for current list derived from character sets at www.kostis.net); unicode is now a character set option

    xml2end  
     

    xml2end converts the MODS XML bibliography to a tagged EndNote (Refer-format) bibliography file. Usage for xml2end is the same as for the other tools.

    xml2end xml_file.xml > output_file.end
    


    Command line options:

    • -v, --version ; report version information
    • -h, --help ; report help
    • -s, --single-refperfile ; put one reference per file, named by the reference number
    • -o, --output-encoding ; write the output file using the requested character set (use w/o argument for current list derived from character sets at www.kostis.net); unicode is now a character set option

    xml2wordbib  
     

    xml2wordbib converts the MODS XML bibliography to a Word 2007-formatted XML bibliography file. Usage for xml2wordbib is the same as for the other tools.

    xml2wordbib xml_file.xml > output_file.word.xml
    


    Command line options:

    • -v, --version ; report version information
    • -h, --help ; report help
    • -s, --single-refperfile ; put one reference per file, named by the reference number
    • -o, --output-encoding ; write the output file using the requested character set (use w/o argument for current list derived from character sets at www.kostis.net); unicode is now a character set option

    faq  
     

    How do I download the files?

    Files can be saved by right-clicking on the link. This will pull up a context-sensitive menu, from which you should choose "Save Link As..." (or whatever the appropriate item is for your web browser). Simply clicking on the links frequently loads the binary into the browser window. Not terribly useful.

    Downloads on this page are going to be archives of all of the executables (as zipped or tarred/gzipped files depending on the architecture).


    The programs don't run for me. What am I doing wrong?

    If you send me this question, I would immediately have to ask for more information. The following items address specific problems.

    • "command not found" The message "command not found" on Linux/UNIX/MacOSX systems indicates that the commands cannot be found. This could mean that the programs are not flagged as being executable (run "chmod ugo+x xml2bib" for the appropriate binaries) or the executables are not in your current path (and have to be specified directly like ./xml2bib). A quick web search on chmod or path variables should provide many detailed resources.

    • I'm running MacOSX and I just get a terminal when I double-click on the programs. Simply put, this is not the way to run the programs. You want to run the terminal first and then issue the commands at the command line. It should be under Applications->Utilities->Terminal on most MacOSX systems I've seen. If you just double-click the program, the terminal corresponds to the input to the tool. Not so useful.

      Some links to get you started running the terminal in a standard UNIX-like fashion are at TerminalBasics.pdf [homepage.mac.com], [macdevcenter.com], and [ee.surrey.ac.uk].

      I'm happy to help with specific questions, but the more knowledgeable you are the easier it will be to help (and I frankly don't have the time to help everyone learn basic UNIX).
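
For the "command not found" case described above, the two checks (executable bit and path) can be sketched as a small helper. The name make_runnable is hypothetical; chmod ugo+x and the ./ prefix are the remedies named in the list.

```shell
# Mark a downloaded tool as executable, then report whether a command of
# the same name can be found on PATH. make_runnable is a hypothetical helper.
make_runnable() {
    tool=$1
    name=$(basename "$tool")
    chmod ugo+x "$tool"
    if command -v "$name" >/dev/null 2>&1; then
        echo "$name is on PATH"
    else
        echo "not on PATH; run it as ./$name from its directory"
    fi
}

# Usage: make_runnable ./xml2bib
```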

    I am very interested in bug reports and problems in conversions. Feel free to e-mail me if you run into these issues. The absolute best bug reports provide error messages from the operating systems and/or input and outputs that detail the problems. Please remember that I'm not looking over your shoulder and I cannot read your mind to figure out what you are doing--"It doesn't work." isn't a bug report I can help you with.


    You have a MacOSX version, can you give me a MacOS9 version?

    Sorry. I'd like to, but these programs assume a command-line interface with normal standard input, standard output, and standard error along with command-line arguments. MacOSX is a fundamental change in the operating system, with a BSD (UNIX-like) core that I'm taking advantage of to provide a MacOSX binary. On the other hand, I don't know that much about MacOS9; if it is possible to generate a useful binary from these sources, let me know.


    This stuff is great, how can I help?

    OK, I actually don't get this question so often, though I have gotten very useful help through people who have willingly sent useful bug reports and sample problematic data to allow me to test these programs. I willingly accept bug reports, patches, new filters, suggestions on program improvements or better documentation and the like. All I can say is that users (or programmers) who contact me with these sorts of things are far more likely to get their itches scratched.



    license  
     

    All versions of bibutils are released under the GNU General Public License (GPL). In a nutshell, feel free to download, run, and modify these programs as required. If you re-release these, you need to release the modified version of the source. (And I'd appreciate patches as well...if you care enough to make the change, then I'd like to see what you're adding or fixing.)


    Chris Putnam, Ph.D.
    cdputnam@scripps.edu
    The Scripps Research Institute
    Last Updated: 4/17/2007