...
You can find a basic memory requirements calculator at: https://secure.mccombs.utexas.edu/public/datasetmemorysizer/default.aspx
Keep in mind that determining the byte size of the columns in your initial data set is not enough. That only tells you how much memory you need to load the initial data set. If you will use that data in calculations, you also need to work out the data types and memory requirements of the resulting data set. For example, consider a very simple data set with just one row and two columns (A and B), each holding a 2-byte number. If you want to add a third column (C) that is calculated from the first two columns, the data type requirements of column C will depend on the type of calculation you are performing.
For example, if the value in column C is the result of a calculation involving the values in columns A and B, the exact value of C depends on what kind of calculation that is (addition, subtraction, multiplication, etc.). Likewise, the number of bytes required by C depends on how large that result might be.
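As a rough illustration of this point, the sketch below finds the smallest signed integer size that can hold the result of a calculation on two columns. The helper names (`bytes_needed`, `result_range`) and the sample value ranges are assumptions made for this example, not part of any particular tool. It only covers addition, subtraction, and multiplication, where the extreme results always occur at the extreme input values.

```python
import operator

def bytes_needed(lo, hi):
    """Smallest signed integer size (1, 2, 4, or 8 bytes) that holds every value in [lo, hi]."""
    for size in (1, 2, 4, 8):
        bits = 8 * size
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return size
    raise OverflowError("range does not fit in an 8-byte signed integer")

def result_range(op, a_range, b_range):
    """Smallest and largest value of op(a, b), given (min, max) ranges for a and b.

    Checking only the four corner combinations is sufficient for addition,
    subtraction, and multiplication. It is NOT sufficient for division.
    """
    corners = [op(a, b) for a in a_range for b in b_range]
    return min(corners), max(corners)

# Hypothetical case: A and B both hold values between 0 and 1000,
# which fit comfortably in 2 bytes.
small = (0, 1000)
print(bytes_needed(*result_range(operator.add, small, small)))  # 2 (largest result is 2000)
print(bytes_needed(*result_range(operator.mul, small, small)))  # 4 (largest result is 1,000,000)

# Worst case: A and B may hold any 2-byte signed value.
int16 = (-(2 ** 15), 2 ** 15 - 1)
print(bytes_needed(*result_range(operator.add, int16, int16)))  # 4
print(bytes_needed(*result_range(operator.mul, int16, int16)))  # 4
```

With the same two 2-byte inputs, column C fits in 2 bytes for one calculation and needs 4 bytes for another, depending only on how large the result can get.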
In the real world, your data sets and calculations won't be this simple, but you would still apply the same principle: when creating a new column based on values of existing columns, you need to determine what the largest possible result might be. When you have a choice in selecting data types, choose one with a byte size large enough for what you need, but no larger than that. If you don't have a choice, because your application enforces its own default rules for byte sizes (as is the case with R), then just be aware of what those byte sizes are and plan accordingly.
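To see how these choices add up, here is a minimal sketch that estimates the memory needed for a data set from its row count and per-column byte sizes. The function name `estimate_bytes`, the one-million-row count, and the column sizes are all hypothetical. The estimate covers raw column data only and ignores any per-object or per-column overhead your application adds.

```python
def estimate_bytes(n_rows, column_sizes):
    """Raw data size in bytes: rows multiplied by the sum of the column byte sizes."""
    return n_rows * sum(column_sizes.values())

n_rows = 1_000_000  # hypothetical row count

# You choose the types: 2-byte A and B, plus a 4-byte calculated column C.
chosen = {"A": 2, "B": 2, "C": 4}
print(estimate_bytes(n_rows, chosen))  # 8000000

# The application chooses for you, e.g. every column stored as a 4-byte integer
# (R's integer type) or as an 8-byte double (R's default numeric type).
print(estimate_bytes(n_rows, {"A": 4, "B": 4, "C": 4}))  # 12000000
print(estimate_bytes(n_rows, {"A": 8, "B": 8, "C": 8}))  # 24000000
```

The same three columns take up to three times as much memory when the application's default sizes apply, which is why it helps to know those defaults before you plan.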
...