We make available all records contained in our database, including not only records we received that fall outside our scope, but also records that we suspect to be erroneous for any reason. Those reasons may include a conflict with the scientific literature, incongruence with other data points in our database, and comments from original collectors and our users. Correcting these records, or determining that they are in fact not erroneous, often requires input from our users, especially those involved as collectors or determiners of those records. For this reason we include them, but we recognize that many users will not want to use them in certain data analyses without at least reviewing them for possible omission. Our query page allows them to be filtered from results.
By far, most of these records are flagged because they are geographic outliers that we have not yet been able to research to a final conclusion. In most cases, specimen examination resolves outliers (by re-determination), but some cases require more thorough research into collectors' field notes, original jar labels, and original hand-written museum ledgers, and/or direct communication with people involved in collecting the specimens.
All suspect records are labeled on the website, along with our comments on the details of our research into each record (species identification, collector comments, and original documentation). Suspect records are identified on the website using a color scheme defined here.
Error detection methods
All of our records are assessed species by species in GIS by a group of experts familiar with Texas fishes.
We are developing tools that evaluate incoming records spatially against a set of generally accepted records and automatically flag outliers in space or time.
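A minimal sketch of how such a check might work is given below. It is an illustration only, not the tool under development. It assumes each record is a dictionary with hypothetical `species`, `lat`, `lon`, and `year` fields, and it uses two arbitrary thresholds (`max_km` and `max_year_gap`) that would need to be tuned per species. A record is flagged when it lies farther than `max_km` from the nearest accepted record of the same species, or when its year falls well outside the range of accepted years.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flag_outlier(record, accepted, max_km=100.0, max_year_gap=25):
    """Return a list of reasons the record looks suspect (empty if none).

    The field names and both thresholds are assumptions for illustration.
    """
    same_species = [r for r in accepted if r["species"] == record["species"]]
    if not same_species:
        return ["no accepted records for species"]

    reasons = []

    # Spatial check: distance to the nearest accepted record of the species.
    nearest = min(
        haversine_km(record["lat"], record["lon"], r["lat"], r["lon"])
        for r in same_species
    )
    if nearest > max_km:
        reasons.append("spatial outlier")

    # Temporal check: year far outside the range of accepted years.
    years = [r["year"] for r in same_species]
    if record["year"] < min(years) - max_year_gap or record["year"] > max(years) + max_year_gap:
        reasons.append("temporal outlier")

    return reasons


# Made-up example data; "species A" and all coordinates are placeholders.
accepted = [
    {"species": "species A", "lat": 30.27, "lon": -97.74, "year": 1950},
    {"species": "species A", "lat": 29.88, "lon": -97.94, "year": 1980},
]
nearby = {"species": "species A", "lat": 30.00, "lon": -97.80, "year": 1990}
distant = {"species": "species A", "lat": 31.76, "lon": -106.49, "year": 1890}

print(flag_outlier(nearby, accepted))
print(flag_outlier(distant, accepted))
```

A nearest-neighbor distance is only one possible test. Comparing records against drainage basins or range polygons in GIS would be a natural alternative, and a flag from any such check marks a record for review rather than confirming an error.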
Records can be added to the list at any time for any logical reason, and we tend to examine specimens ad hoc as we work in the collections. Suspicions are often addressed immediately when they involve our own specimens, and any changes to determinations or other data go into our data update queue.