Etech > The Data Dump > Fun With Graphs and Charts

David Sifry – state of the blogosphere
 
Mar 2006 over 100,000 blogs created daily
new weblog every second
50% of new bloggers still posting 3 months later
10% of all blogs update weekly or more
about 9% of new blogs are spam, with spikes as high as 25%
 
60% of pings come from known spam sources
Technorati blocks most of these now
 
daily posting volume
1.2 million legitimate posts/day
about 50,000 postings per hour
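
As a quick sanity check on the figures above, the daily totals can be converted to hourly and per-second rates. This is only a back-of-envelope sketch using the numbers quoted in the talk; the helper names are made up for illustration.

```python
# Back-of-envelope rate conversions for the daily figures in the notes.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_hour(per_day: float) -> float:
    """Average hourly rate for a daily total."""
    return per_day / 24


def per_second(per_day: float) -> float:
    """Average per-second rate for a daily total."""
    return per_day / SECONDS_PER_DAY


posts_per_hour = per_hour(1_200_000)    # legitimate posts/day, as quoted
blogs_per_second = per_second(100_000)  # new blogs/day, as quoted

print(f"{posts_per_hour:,.0f} posts per hour")
print(f"{blogs_per_second:.2f} new blogs per second")
```

That works out at 50,000 posts per hour and roughly 1.16 new blogs per second, so the "about 50,000 per hour" and "new weblog every second" figures are consistent with the daily totals.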
 
Blogs vs mainstream media: some blogs are up there now, replacing traditional media
 
language breakdown – Japanese 41%, English 28%, Chinese 14%; all others 4% or less
 
Japanese posts tend to be shorter, so there are more of them
 
almost one half of blog posts use tags or categories
 
over 81 million tagged posts, growing 400k/day
 
about 24% of all posts use rel=”tag” microformat (only created 1 yr 3 months ago)
 
More than search – it’s about exposing community and context
 
Eric ? – CTO Feedburner
 
showing us feedstorm – a visualisation of hits after launch, going from a few drops to lots, punctuated by events like the launch of iTunes podcasts, which makes the screen go mad. Then a quick drill through what the visual is representing:
 
position is random
size is size of feed/circulation
colours are types (text vs podcast, for example)
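
The encoding described above (random position, size from circulation, colour from type) could be sketched roughly as follows. This is not FeedBurner's code: the `Bubble` class, the palette, the canvas size and the square-root scaling are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Bubble:
    x: float
    y: float
    radius: float
    colour: str


# Assumed palette; the talk only said colours distinguish feed types.
COLOURS = {"text": "blue", "podcast": "orange"}


def feed_to_bubble(circulation: int, feed_type: str, rng: random.Random,
                   width: int = 800, height: int = 600) -> Bubble:
    """Map one feed to a circle: random position, size from circulation."""
    # Assumption: scale radius by sqrt so the circle's *area* tracks circulation.
    radius = math.sqrt(circulation)
    return Bubble(
        x=rng.uniform(0, width),
        y=rng.uniform(0, height),
        radius=radius,
        colour=COLOURS.get(feed_type, "grey"),
    )


rng = random.Random(0)  # seeded so the layout is repeatable
bubble = feed_to_bubble(circulation=400, feed_type="podcast", rng=rng)
print(bubble)
```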
 
200,000 feeds managed now
 
Adam Messinger – Gauntlet Systems
 
next gen continuous integration system
no more broken builds or smoke tests
comprehensive analytics
 
Lotka’s Law – in 1926 Lotka observed an 80/20-rule relationship in scientific productivity
the posit is that this applies to development too: the few most productive developers account for the majority of the code
 
Three open source projects that have checked into their system
 
ActiveMQ – two guys do 80% of the work, the remaining 10 or so do the rest
Lucene – flatter curve, three of ten do the majority of the work (but it’s a mature product)
Hibernate EJB3 – most work done by two, with only 5 others doing the rest (commercial)
 
Conclusion
it’s true that a few developers do most of the work
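
One way to check this kind of claim against a project's history is to count how many of the top committers it takes to reach a given share of the commits. The sketch below assumes you already have a commit count per author; the counts are invented for illustration and are not from any of the three projects, and this is not how Gauntlet computes its analytics.

```python
def authors_for_share(commits_by_author: dict, share: float = 0.8) -> int:
    """Smallest number of top committers whose commits reach `share` of the total."""
    total = sum(commits_by_author.values())
    if total == 0:
        return 0
    running = 0
    # Walk authors from most to least productive, accumulating their commits.
    ranked = sorted(commits_by_author.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        running += count
        if running >= share * total:
            return n
    return len(ranked)


# Hypothetical commit counts, made up for this example.
commits = {"a": 480, "b": 360, "c": 60, "d": 40, "e": 30, "f": 20, "g": 10}

print(authors_for_share(commits))        # committers needed for 80% of commits
print(authors_for_share(commits, 0.95))  # committers needed for 95% of commits
```

With these made-up numbers, two of the seven authors cover 80% of the commits, which is the shape described for ActiveMQ; a flatter project like Lucene would need more authors to reach the same share.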
 
Andy Edmonds – Windows Live Search
 
Data Dump, ETech 06
 
year in review – the world’s attention leaves traces on the internet
most popular places looked for in virtual earth – 1000 top queries
a visual of web queries
 
Roger Magoulas – O’Reilly
 
AJAX/Ruby trends – normalisation – how to spot emerging trends
 
Lots of graphs tracking trends through number of Google pages referencing the term
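
The notes don't record how the normalisation was done, so the following is only one plausible reading: divide each term's page count by a baseline count for the same period (so that overall growth of the index doesn't look like a trend), then look at period-on-period growth. All numbers below are invented.

```python
def normalise(term_counts: list, baseline_counts: list) -> list:
    """Express each period's term count as a fraction of a baseline count."""
    if len(term_counts) != len(baseline_counts):
        raise ValueError("series must cover the same periods")
    return [t / b for t, b in zip(term_counts, baseline_counts)]


def growth(series: list) -> list:
    """Period-on-period relative growth, e.g. 1.0 means the value doubled."""
    return [later / earlier - 1 for earlier, later in zip(series, series[1:])]


# Hypothetical counts of pages mentioning a term, per period.
term = [10, 30, 90]
# Hypothetical baseline: total pages sampled in the same periods.
baseline = [100, 150, 300]

shares = normalise(term, baseline)
print(shares)
print(growth(shares))
```

Here the raw count triples each period, but against the growing baseline the term's share grows more slowly, which is the effect normalisation is meant to expose.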
 
Ian Kallen – Technorati
 
Web pages lie but the stats don’t
 
The Metrics of the Dark Underbelly
blog spam facts from the guy building Technorati while it flies…
 
They tap the stream of data to find spam
 
They expose domains that are known spammers
 
looking for concentrations of blogs, for example clusters on a single domain or sharing one AdSense ID
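
The clustering idea can be sketched as a simple group-by: bucket blogs by a shared attribute (domain, AdSense ID) and flag buckets above some size. This is a minimal illustration, not Technorati's method; the field names, the threshold and the sample records are all assumptions.

```python
from collections import defaultdict


def suspicious_clusters(blogs: list, key: str, min_size: int = 3) -> dict:
    """Group blog URLs by a shared attribute and keep only the large groups."""
    groups = defaultdict(list)
    for blog in blogs:
        value = blog.get(key)
        if value:  # skip blogs that lack the attribute
            groups[value].append(blog["url"])
    return {value: urls for value, urls in groups.items() if len(urls) >= min_size}


# Hypothetical records; URLs and IDs are placeholders.
blogs = [
    {"url": "http://a.example.com", "adsense_id": "pub-0001"},
    {"url": "http://b.example.com", "adsense_id": "pub-0001"},
    {"url": "http://c.example.com", "adsense_id": "pub-0001"},
    {"url": "http://d.example.com", "adsense_id": "pub-0002"},
    {"url": "http://e.example.com"},
]

print(suspicious_clusters(blogs, key="adsense_id"))
```

A large cluster is only a signal, not proof: plenty of legitimate blogs share a host, which is the same caveat the talk made about Blogspot.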
 
again lots of stats on Blogspot, not because of the service itself but because it is free
 
? – AttentionTrust.org
 
a run through of setting up a vault for your clickstream, nothing new here
 
the sessions run over, and there is chaos: another session is supposed to be starting, and people are leaving, arriving and pushing past
 
David Hornik – August Capital
 
Sept 2005 to Feb 2006
 
A funny skit of email breakdown
 
most common mail: Introduction (976)
 
themes around taking breaks and not doing work. A nice break from the norm; to be truthful, it’s an indication of the old spirit of ETech and just a bit of harmless fun
