[TriLUG] Tracking File Downloads

Ken M ken at mack-z.com
Sun Jan 25 13:08:34 EST 2015


No worries, I don't consider it a hijack.  You gave a really good suggestion for me to add to my arsenal.

Cloudflare or not, I still want to figure out my own on-server solution to this as well.  But I think the combination will help me improve the confidence I have in report accuracy.

Right now, this is what I am doing for reporting:

I have two scripts, one meant to be run daily and one hourly.  Both filter for the site (it is a multisite install) as they build their working logs.

The daily one builds temporary month, week, and day logs from zcat reads of the archive.  Those logs are then run through goaccess to build HTML reports.
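
As a rough sketch of that daily filtering step (not the actual script), the same idea in Python might look like the following. It assumes the access log carries the default nginx timestamp format and that the site's hostname appears somewhere in each line, which the stock "combined" format does not include. The names `filter_lines`, `build_working_log`, and `WINDOWS` are made up for illustration, and the 30-day month is an assumption.

```python
import gzip
import re
from datetime import datetime, timedelta

# Assumed timestamp format, as in nginx's default logs: [25/Jan/2015:13:08:34 -0500]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")

# The three report modes; treating a month as 30 days is an assumption.
WINDOWS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
}

def parse_time(line):
    """Return the request time of a log line, or None if no timestamp is found."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")

def filter_lines(lines, site, since):
    """Yield the lines that mention `site` and are no older than `since`."""
    for line in lines:
        when = parse_time(line)
        if when is not None and when >= since and site in line:
            yield line

def build_working_log(archives, out_path, site, since):
    """Read gzip'd archives (much like zcat) and write matching lines to out_path."""
    with open(out_path, "w") as out:
        for archive in archives:
            with gzip.open(archive, "rt", errors="replace") as handle:
                out.writelines(filter_lines(handle, site, since))
```

Calling `build_working_log` once per entry in `WINDOWS`, with `since` set to a timezone-aware "now" minus the window, would give the three working logs to hand to goaccess.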

Those daily logs have three modes: past month, past week, and past day.  There are two categories for each mode: episode downloads and full site statistics.  The latter is an unfiltered run through goaccess.

The hourly job just passes the current nginx access log through goaccess, filtering only for site, and builds a report.

So all of that is working.  In Drupal I have a content type for site reports that only the admin and editors can see.  Each one is a quick PHP script that checks for the presence of the report it is meant to display.  If the report is there, it prints the last modified date and then includes the HTML of the report.  If not, it displays an error message.
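
The check-then-include logic of those report pages is simple enough to sketch. The real pages are PHP inside Drupal; this is only the same idea in Python, with a hypothetical function name and made-up message wording.

```python
import os
from datetime import datetime

def render_report(report_path):
    """Return the HTML for a report page, or an error message if the report is missing."""
    if not os.path.isfile(report_path):
        # Hypothetical wording; the original error message is not shown in the post.
        return "<p>Error: this report has not been generated yet.</p>"
    # Show when the report file was last rebuilt, then include it verbatim.
    modified = datetime.fromtimestamp(os.path.getmtime(report_path))
    with open(report_path, encoding="utf-8", errors="replace") as handle:
        body = handle.read()
    return "<p>Last updated: {:%Y-%m-%d %H:%M}</p>\n{}".format(modified, body)
```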

So far, an elegant solution.  I still need to set up Drupal to alter the template for the reports, though, as the overall site template kind of gets in the way.  The reports live outside any web root, and the Drupal pages that include them are secured, so no user without the proper credentials can get to them.

The last thing I need to do in all this is set up a monthly cron job to snapshot each month's report to an archive location.
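
A minimal sketch of what that monthly snapshot could do, assuming the reports are `.html` files in one directory and the archive is organized into one `YYYY-MM` folder per run. The function name, the directory layout, and the choice to label the folder by the run date are all assumptions.

```python
import shutil
from datetime import date
from pathlib import Path

def snapshot_reports(report_dir, archive_dir, today=None):
    """Copy each HTML report into archive_dir/YYYY-MM/ and return the copied paths."""
    today = today or date.today()
    # Assumed layout: one folder per snapshot, named after the month of the run.
    target = Path(archive_dir) / today.strftime("%Y-%m")
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for report in sorted(Path(report_dir).glob("*.html")):
        # copy2 keeps the modification time, so the archived copy shows when it was built.
        copied.append(Path(shutil.copy2(report, target / report.name)))
    return copied
```

Run from cron once a month, this leaves the live reports in place and only adds dated copies.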

The need for this solution is that Drupal's own site statistics, although they capture the events, are not the most friendly to view, and their collection hits the DB, which is a performance concern.  Analytics with Drupal can only capture so much, and direct file access, like podcatchers use, is not one of those things.

Ken

On Sun, Jan 25, 2015 at 12:57:50PM -0500, Igor Partola wrote:
> Oh, I absolutely understand that. The free CloudFlare plan gives you a TTL of one hour, which can be way too long if you are doing active development. Though this won't help you with analytics, they do have a lower level option where they don't cache things at all, but then they also just marshal TCP connections around.
> 
> On the whole, I think a CDN is a very good thing to do for any size site. Let them worry about network capacities, uptime, etc. I am certainly converting all my sites to use it, though slowly.
> 
> I'll stop hijacking this thread now as your actual question still needs to be answered.
> 
> Igor

