Get an array containing the number of posts created in the last 2 weeks

Question

I want to create sparklines that illustrate the number of posts created on my blog in the last 2 weeks. To do this, I first need to build an array that contains the number of posts created on each day of that period.

For example, this array:

[40, 18, 0, 2, 39, 37, 22, 25, 30, 60, 36, 5, 2, 2]

generates this sparkline (I use the Googlecharts wrapper around the Google Chart API):

chart?chco=orange&chd=s:YLABYWNPSlWDBB&cht=ls&chs=120x60&chxr=0.40,100

My question is how to create these arrays. Here's what I'm doing now (I use Searchlogic to run the queries, but it should be clear even if you've never used it):

    history = []
    14.downto(1) do |days_ago|
      history.push(Post.created_at_after((days_ago + 1).day.ago.beginning_of_day).created_at_before((days_ago - 1).days.ago.beginning_of_day).size)
    end

This approach is ugly and slow - there must be a better way!

+4

ruby-on-rails google-visualization charts reporting sparklines

Tom lehman Mar 2 '10 at 21:25

5 answers

You need to index your data properly or this will never be efficient. If you are working at day granularity, having a DATE column pays off. You can then use a standard SQL GROUP BY operation to get the values you need.

For example, a migration could look like this:

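    def self.up
      # add_column needs an explicit column type; :date matches day granularity
      add_column :posts, :created_on_date, :date
      add_index :posts, :created_on_date
      # backfill the new column from the existing timestamps
      execute "UPDATE posts SET created_on_date=created_at"
    end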

Then the search is very fast, since it can use the index:

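    def sparkline_data
      # select_rows returns every selected column, so each row is a
      # [date, count] pair (select_values would return only the dates).
      self.class.connection.select_rows("
        SELECT created_on_date, COUNT(id)
        FROM posts
        WHERE created_on_date > DATE_SUB(UTC_TIMESTAMP(), INTERVAL 14 DAY)
        GROUP BY created_on_date
        ORDER BY created_on_date
      ").collect { |date, count| [date, count.to_i] }
    end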

Keep in mind that if a day can have no posts at all, you will have to account for it by inserting a zero into your results. The date is returned here, so you should be able to work out which days are missing and fill them in. This is usually done by iterating over the range of days with collect.
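The gap filling described above can be sketched in plain Ruby. This is only an illustration, not the answerer's code: `fill_daily_counts` is a hypothetical helper, the sample rows are made up, and it assumes the query hands back `[date, count]` pairs whose dates parse as `YYYY-MM-DD`.

```ruby
require 'date'

# Hypothetical helper: turns [date, count] rows (as a grouped SQL query might
# return them) into a fixed-length array with one entry per day, oldest first.
def fill_daily_counts(rows, last_day, days = 14)
  # Index the rows by date; keys are normalized to Date, counts to Integer.
  counts = {}
  rows.each { |date, count| counts[Date.parse(date.to_s)] = count.to_i }

  first_day = last_day - (days - 1)
  # Walk every day in the window and substitute 0 where the query had no row.
  (first_day..last_day).collect { |day| counts[day] || 0 }
end

# Sample rows (made up for illustration); March 2 and 3 have no posts.
rows = [['2010-03-01', '40'], ['2010-03-04', '2']]
p fill_daily_counts(rows, Date.new(2010, 3, 4), 4)
# prints [40, 0, 0, 2]
```

Passing the last day in explicitly, rather than calling `Date.today` inside the helper, keeps the result reproducible.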

When you need to retrieve a thin slice of data quickly, loading model instances will always be a major bottleneck. Often you have to drop down to SQL when there is no easy way to fetch exactly what you need.

+1

tadman Mar 2 '10 at 21:36

Try the following:

    days_ago = 14 # length of the reporting window
    n_days_ago, today = (Date.today - days_ago), Date.today

    # get the count by date from the database
    post_count_hash = Post.count(:group => "DATE(created_at)",
      :conditions => ["created_at BETWEEN ? AND ?", n_days_ago, today])

    # now fill the missing dates with 0
    (n_days_ago..today).each { |date| post_count_hash[date.to_s] ||= 0 }

    # sort by date and keep only the counts
    post_count_hash.sort.collect { |kv| kv[1] }

Note 1: If you add an index on created_at, this approach should scale well. If you create millions of records per day, you are better off storing the per-day post counts in a separate table.

Note 2: You can cache the results and let them expire to improve performance. On my system, I usually set a TTL of 10-15 minutes.
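As a rough sketch of the caching idea in Note 2, here is a tiny in-process cache with a time-to-live. `TtlCache` is a hypothetical class written for this example, not part of Rails or of the answer; in a real application the framework's own cache store would normally play this role.

```ruby
# Hypothetical in-process cache holding a single value with a time-to-live.
class TtlCache
  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @value = nil
    @expires_at = nil
  end

  # Returns the cached value, recomputing it via the block once it has expired.
  # `now` is injectable only to make the expiry behaviour easy to demonstrate.
  def fetch(now = Time.now)
    if @expires_at.nil? || now >= @expires_at
      @value = yield
      @expires_at = now + @ttl
    end
    @value
  end
end

cache = TtlCache.new(10 * 60) # 10-minute TTL, within the range suggested above
calls = 0
start = Time.now
cache.fetch(start)       { calls += 1; [40, 18, 0] } # first call computes
cache.fetch(start + 60)  { calls += 1; [40, 18, 0] } # still fresh, served from cache
cache.fetch(start + 601) { calls += 1; [40, 18, 0] } # expired, recomputes
puts calls
# prints 2
```

The block stands in for whichever query builds the counts array, so the database is hit at most once per TTL window.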

+1

Harish shetty Mar 2 '10 at 21:41

In addition to tadman's answer, if you have the necessary admin access you could look into partitioning the table by date, especially if you get an extremely high volume of posts per day.

0

newdayrising Mar 2 '10 at 21:40

Most of the time is spent running 14 database queries, each of which has to scan every row of the table to check the date (unless created_at is indexed).

To minimize this, we can make a single database query to fetch the relevant rows and then sort them into days ourselves.

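    history = Array.new(14, 0)
    # 13 days ago through today is 14 days, matching indexes 0..13
    recent_posts = Post.created_at_after(13.days.ago.beginning_of_day)
    recent_posts.each do |post|
      # history[0] is today; call history.reverse for oldest-first order
      history[(Date.today - post.created_at.to_date).to_i] += 1
    end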

I also recommend adding an index, as tadman suggests, but in this case on the created_at column of the posts table.
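The in-memory bucketing idea can be tried without a database. The following is a minimal sketch under stated assumptions: `bucket_by_day` is a hypothetical helper, the timestamps are made up, and it returns counts oldest first, which is the order a sparkline usually wants.

```ruby
require 'date'

# Hypothetical helper: counts timestamps per day over a window ending on
# `today`, returning the counts oldest first.
def bucket_by_day(timestamps, today, days = 14)
  history = Array.new(days, 0)
  timestamps.each do |time|
    days_ago = (today - time.to_date).to_i
    # Ignore anything outside the window instead of indexing out of range.
    next unless (0...days).cover?(days_ago)
    history[days - 1 - days_ago] += 1
  end
  history
end

today = Date.new(2010, 3, 2)
timestamps = [
  Time.local(2010, 3, 2, 9, 30),
  Time.local(2010, 3, 2, 18, 0),
  Time.local(2010, 2, 28, 12, 0),
  Time.local(2010, 1, 1, 0, 0) # older than the window, skipped
]
p bucket_by_day(timestamps, today, 3)
# prints [1, 0, 2]
```

The range check is there because a timestamp just outside the window would otherwise index past the end of the array, or wrap around with a negative index.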

0

Edward anderson Mar 2 '10 at 21:59

Alex reisner · Accepted Answer · 2010-03-02T21:49:54+0000

This will give you a hash mapping dates to post counts:

    counts = Post.count(
      :conditions => ["created_at >= ?", 14.days.ago],
      :group => "DATE(created_at)"
    )

Then you can turn this into an array:

    counts_array = []
    14.downto(1) do |d|
      counts_array << (counts[d.days.ago.to_date.to_s] || 0)
    end
