This project addresses a long-standing need to bring together in one database the worldwide museum holdings of Texas fishes. Before this project, museum data were available only from many disparate and often hard-to-find sources located in several countries and managed in various incompatible databases. Some of these museums have no digital record of their collections and keep paper ledgers only. Many are small museums that do not offer their data online. Some have no catalog at all, except what is recorded on jar labels. Extensive efforts were made to find, format, and compile data from these museums into one database.
Finding, reformatting, and merging data from these sources was a critical task, but we have made the merged data even more useful through considerable editing and clean-up. Museums vary considerably in how they manage their data. Many rarely update their databases as taxonomy changes or re-examine specimens as new information comes to light. Spelling mistakes and other typographical errors are common across all data fields in most museums. These problems make useful queries difficult or even impossible. Had we not addressed these issues, the data would be only moderately useful.
We attempted to georeference all records from the state, although some could not be georeferenced because their locality descriptions were vague or internally inconsistent. We synonymized the taxonomy and, in many cases, examined specimens to verify identifications. This basic work has extended the known range of some species beyond what was previously thought. In addition, we formatted and edited dates and collector names. These improvements make the database more useful because data can now be queried on many fields, including geography and taxonomy.
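As an illustration only, the kind of name and date standardization described above could be sketched in Python as follows. The field names (`species`, `date`), the one-entry synonym table, and the list of accepted date formats are assumptions made for this example; they are not taken from the project's actual workflow or database schema.

```python
from datetime import datetime

# Hypothetical synonym table: maps outdated name variants (lower-cased)
# to a single accepted name. A real table would be far larger.
SYNONYMS = {
    "notropis lutrensis": "Cyprinella lutrensis",
    "cyprinella lutrensis": "Cyprinella lutrensis",
}

# Assumed set of date layouts that might appear in museum ledgers.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y")


def normalize_name(raw):
    """Return the accepted name, or None if the name is not in the table."""
    key = " ".join(raw.split()).lower()  # collapse whitespace, ignore case
    return SYNONYMS.get(key)


def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_record(record):
    """Return a copy of the record with standardized name and date fields."""
    cleaned = dict(record)
    cleaned["accepted_name"] = normalize_name(record.get("species", ""))
    cleaned["iso_date"] = normalize_date(record.get("date", ""))
    return cleaned
```

Values that cannot be standardized come back as `None` rather than being guessed, so that unresolved records can be flagged for manual review, much as vague localities were left ungeoreferenced.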
This project is timely because more and more large complementary data sets are becoming available online, along with new tools for complex data analyses. To date, much of what is thought to be known about Texas fish distributions is based on anecdote or on publications with identifications that cannot be verified. We believe it is difficult to overstate the utility of this high-quality database. Before this project, researchers would not have been able to find the records that we now provide in an easily searchable database. Even if researchers had found the records, all of the error checking and verification steps that we have done would still have been needed. Those steps benefited greatly from the sheer number of records that we have, since some of our data editing steps relied on the content of other records in the database. We now provide this database to researchers and the public so they can peruse it and use it for their own research interests.