Storing extremely small values in Amazon Redshift

I create a table in Amazon Redshift using the following command:

    CREATE TABLE asmt.incorrect_question_pairs_unique AS
    SELECT question1,
           question2,
           occurrences,
           occurrences / (SUM(occurrences)::FLOAT) OVER () AS prob_q1_q2
    FROM (SELECT question1, question2, SUM(occurrences) AS occurrences
          FROM asmt.incorrect_question_pairs
          GROUP BY question1, question2
          HAVING SUM(occurrences) >= 50) t

I also tried the option:

    CREATE TABLE asmt.incorrect_question_pairs_unique AS
    SELECT question1,
           question2,
           occurrences,
           occurrences::FLOAT / SUM(occurrences) OVER () AS prob_q1_q2
    FROM (SELECT question1, question2, SUM(occurrences) AS occurrences
          FROM asmt.incorrect_question_pairs
          GROUP BY question1, question2
          HAVING SUM(occurrences) >= 50) t

I would like the prob_q1_q2 column to be a float column, so I am casting the numerator / denominator to float. But in the resulting table, that column contains nothing but zeros.

I would like to point out that SUM(occurrences) will be around 10 billion, so the prob_q1_q2 column will contain extremely small values. Is there a way to store such small values in Amazon Redshift?

How can I make sure all values in this column are stored as non-zero floats?
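For a sense of scale, here is a minimal Python sketch of the arithmetic involved (the numbers come from the question; the fixed-scale analogy is an assumption, not Redshift itself): a pair seen 50 times out of ~10 billion has probability 5e-9, which a 64-bit float holds easily, but which a coarse fixed-scale type such as NUMERIC(12,6) would collapse to zero.

```python
from decimal import Decimal

occurrences = 50
total = 10_000_000_000  # ~10 billion total occurrences

# A 64-bit float represents this tiny probability with no trouble.
prob = occurrences / total
assert prob == 5e-09

# But quantizing to 6 decimal places (roughly what storing the value
# in a NUMERIC(12,6) column would do) collapses it to zero.
stored = Decimal(prob).quantize(Decimal("0.000001"))
assert stored == 0
```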

Any help would be appreciated.

+5
4 answers

METHOD 1 - I had the same problem! In my case it was about a million rows, so I multiplied the result by 10000 before storing it; whenever I select values from that column, I divide by 10000 in the select expression to even it back out. I know this is not an ideal solution, but it works for me.

METHOD 2 - I created a sample table with the type NUMERIC(12,6), and when I imported a result set similar to yours into it, I could see float values up to 6 digits of decimal precision.
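METHOD 1's scale-and-divide trick can be sketched in Python (a hedged illustration; the 10000 factor comes from the answer, the function names are my own):

```python
import math

SCALE = 10000  # multiplier from METHOD 1

def to_stored(prob):
    """Scale the tiny probability up before writing it to the table."""
    return prob * SCALE

def from_stored(value):
    """Divide by the same factor in the SELECT when reading it back."""
    return value / SCALE

p = 50 / 10_000_000_000        # 5e-09, too small for a coarse column
stored = to_stored(p)          # ~5e-05, large enough to survive NUMERIC(12,6)
assert math.isclose(from_stored(stored), p)
```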



I think the cast is not the problem: when you use CREATE TABLE AS, the column type is inferred from the query, so you need to create the table with an explicitly defined column type instead, which guarantees the result set is kept at the precision you defined. It is strange how the same SELECT returns 0.00, yet when it is inserted into a table with an explicitly typed column it returns 0.00333.

If I've made a bad guess, leave a comment and I will amend my answer.

+1

Patthebug,

You might be getting a number too small to be represented by Amazon Redshift's FLOAT type. Try using DECIMAL instead; at high precision Redshift stores it as a 128-bit value, so your number should fit.

How it works: if the value is too large (or, in your case, too small) for the type, the trailing digits are truncated, and the truncated value is what gets stored in the variable / column of that type. When a large value is truncated you lose almost nothing: cut 20 cents out of 20 billion dollars and you do no real harm. But when the number is very small, truncating the trailing digits to fit the type can lose everything. For example, if a type can store 5 digits after the decimal point and you want to store 0.000009 in a variable / column of that type, the value does not fit, the last digit is trimmed off, and you are left with 0.00000.
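The 5-digit example can be checked with Python's decimal module (an analogy for the truncation described, not Redshift itself):

```python
from decimal import Decimal, ROUND_DOWN

value = Decimal("0.000009")

# Truncated to fit a type with only 5 digits after the point,
# the value collapses to zero...
truncated = value.quantize(Decimal("0.00001"), rounding=ROUND_DOWN)
assert truncated == 0

# ...while one extra digit of scale preserves it intact.
kept = value.quantize(Decimal("0.000001"), rounding=ROUND_DOWN)
assert kept == Decimal("0.000009")
```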

So, if you followed my reasoning, a simple change from ::float to ::decimal should fix your problem. The decimal cast may require you to specify precision and scale, e.g. ::DECIMAL(38,37) (38 is the maximum precision Redshift allows).

+1

Try:

select cast(num1 as float) / cast(num2 as float);

This will give you results up to 2 decimal places (the default), but will take some processing time. Anything else rounds the decimal part away.
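The effect of the casts can be mimicked in Python, where `//` plays the role of SQL integer division (an analogy only; the names num1/num2 are the answer's):

```python
num1, num2 = 50, 10_000_000_000

# With both operands left as integers, the tiny ratio truncates to zero.
assert num1 // num2 == 0

# Casting to float before dividing keeps the small result.
assert float(num1) / float(num2) == 5e-09
```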


0

You can have up to 38 digits of precision in a DECIMAL / NUMERIC column, with up to 37 of them after the decimal point.

    CREATE TEMP TABLE precision_test (test NUMERIC(38,37)) DISTSTYLE ALL;

    INSERT INTO precision_test
    SELECT CAST(0.0000000000000000000000000000000000001 AS NUMERIC(38,37)) AS test;

    SELECT * FROM precision_test;
    -- Returns 0.0000000000000000000000000000000000001
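The same boundary can be mimicked with Python's decimal module (an analogy for NUMERIC(38,37), not Redshift itself): a scale of 37 digits just barely holds the value, and one digit less truncates it to zero.

```python
from decimal import Decimal, ROUND_DOWN

tiny = Decimal("1E-37")  # the 37-decimal-digit value from the SQL above

# A scale of 37 digits (as in NUMERIC(38,37)) holds it exactly...
assert tiny.quantize(Decimal("1E-37"), rounding=ROUND_DOWN) == tiny

# ...but a scale of only 36 digits truncates it to zero.
assert tiny.quantize(Decimal("1E-36"), rounding=ROUND_DOWN) == 0
```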
0

Source: https://habr.com/ru/post/1265305/
