Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...


Refine, as a default, displays 10 rows. You can have it display up to 50 but not more (1). Refine is not a tool for modifying data within cells one at a time. It is best used for dealing with whole swaths of data. Refine does that by a tool called 'facet' (2), which is an option you find by clicking on the down-arrow on which ever column you wish to facet. Faceting data is like a filter for selecting data that meets a certain criteria- it can be a word, length of an entry, or just lumping data into how many times it occurs. You can also facet many rows at once, to get a very precise set of data which you can then act on. In the example below, the facet was set to 'text facet' (3). Faceting the column this way shows the data in the cells (4), and how many times that data is used . The column 'Type status' (4) will only have a handful of variety (4 choices in this case). Something like 'Collection Number' would have many- 411 choices (5). Please note that the Facets allow you to sort by name or count. 

Taking a closer look at the "Type Status" facet we see many entries that won't upload into Specify. There are 21 entries that read "mentioned*", and six that say "Figured*". By clicking on the 'edit' option that appears when we hover over that selection, the edit box appears and we can change all 6 entries at once. The number of choices now becomes 3, and the number of entries that say 'Figured' has gone from 21 to 27. We can do the same for the entries 'mentioned*'. Specify expects to see 'referred', not 'mentioned*', so we use this same process to change those records. There are many other columns where this can be done, also- Building (adding 'Building' to 122 and 33), you can also use 'edit' to add things to the blanks, such as adding 'dry' to all the blank entries in 'prep type' and so on. Facet by name also helps identify typos and misspelling ('Texas' has 300 entries, while Texsa has 4)

Faceting by words can help mine data out of comments fields. From the drop down on the column you are going to be mining, select 'Facet' then 'Customized Facets' and from that sub menu, select 'Word facet'. You can also facet for patterns (like lat/long entries in hours, min, seconds) using regex. 

Moving the data is easy- faceting the Comments column for entries with the word 'cast' shows us which entries were noted in their inventory as being 'casts'. We can see in the Text Facet of the Comments column that the word 'cast' is being used to mean a cast of a specimen. We can then see the records in the PrepType Text Facet is blank for those records. Edit the blank cells in bulk using the edit option in the facet display box. Change (blank) to Cast, and apply the change. This process can be done on any type of data. 

When preparing a spreadsheet for upload to Specify, there are many instances where you'll need to combine data from many columns into one, or transpose data from one column into another. Refine makes this very simple, using the 'Transform' option. For example, we are putting the previous storage location information into the Comments section, and renaming this column 'Inventory Remarks' (the field name in Specify). Renaming the columns can be done via the 'Edit Column' drop down, but taking the information from 3 fields and adding it to another field is a little trickier. For this, we will need to combine a few tricks we got from Refine Recipes. The recipe for Merging columns is:

cells["col1"].value + ", " + cells["col2"].value

cells = each cell in whatever is inside the [].

"col1" = the name of first column you want to combine

.value = tells Refine to get the exact value of the cells.

+ = combine this value with
", " = telling Refine to add a comma and a space
+ = combine this value with
cells = (same as above)
"col2" = the name of the second column you want to combine
.value = tells Refine to get the exact value of the cells.

So, it's a coded sentence that is telling Refine "for each cell of this column, take that value and add a ", " to it. Then, take the cells in this other column and add the values to behind the ", ".

First things first though. You are going to want to add a separator in front of any data that is already in the Comments Column. To do this, we'll be using facets to act ONLY on the cells with data in them, and the Transform command to add a character in front of the comments. To do this, we create a facet on the comments section, hover over the (blank) and click on the "include" option. Then select the 'invert' option near the top of the facet title.  Image Added
This 'flips' the selection, and now you should see only the cells with data in them. From the Comments drop down, select Edit Cells and from the sub menu, select Transform. A new window will appear. Adding information is very simple. Make sure the Language is set to the default value, Google Refine Expression Language (GREL). We are going to add a semi-colon and a space before the existing data.  Image Added
Now that we've done that, we can add the previous storage location without ending up with statments in the field that read "SW 269/9Young, Oversized"!

Remember, when we are ADDING data to another column, you have to tell Refine to also add the data from the cells in the column you are working in. 

In practice, it looks like this: Image Added
The same 'recipe' is the foundation for how we take our old database numbering style and make it Specify compatible.