Recently in stats Category

A year of stats

2011 was a busy year. I changed jobs and became a moderator on ServerFault, both of which impacted my blogging activity. The former more than the latter, due to greatly reduced incidences of boredom. What kind of traffic did I get last year? Not a lot, but enough.

  • 40K pageviews over the year.
  • 20K separate visitors; clearly I get a lot of search traffic.
  • 1.1M feed-hits.
  • 412K feed-article views. Clearly most of my readers are using, er, readers.
  • Busiest day: January 27 (I made reddit)
  • Max feed-subscribers: 450
  • Max feed-hits on a day: 1,295

Top Content, Web

  1. The Linux Boot Process, a chart
  2. LIO-Target on OpenSUSE
  3. Powershell and ODBC
  4. Reverting LVM Snapshots
  5. Sysadmin Best Practices

Top Referring Sites

  1. Reddit (nearly all of it for the boot-process chart)
  2. Google (thank you search)
  3. Stumbleupon (more boot-process chart, but a few others as well)
  4. Planet Sysadmin (an RSS aggregator of a bunch of sysadmin blogs)
  5. ServerFault (some from questions, more from links in the chat-room. Hi guys).

Is this the big-time? Hardly. Making reddit that one time added about 40 subscribers. Leaving WWU lost me about 50 subscribers, which I've since gained back with interest for various reasons.

This does represent growth over 2010, which I'm quite OK with. Working for a for-profit company with secret sauce to protect, sauce I'm also working on, has constrained what I can blog about here, which is why ServerFault content has been more prominent of late. However, 2012 is the year when some of what I'm working on will be let out the door to soar (or crater), so hopefully I'll be able to talk more about that stuff once it's out.

Meta-commentary on sysadminly things in general will continue, though. 

Happy New Year!

Changing student storage habits

I had to do some maintenance on my script that gathers disk-space usage, so the stats database has been on my mind lately. It's been a while since I posted any graphs. This particular graph is a unified chart of the student home-directory volumes over time. I merged the NetWare and Windows volumes into a single space-used chart.

stu-vols-2011.png
This is a very noisy chart. The discontinuities are mostly student-account-purge events that happen once a quarter, but the fall purge is by far the largest.

Note the downward tail at the end! The same chart for staff is a pretty smooth line straight up at a pretty steady slope. This? Clearly usage-habits are changing. I don't know if this reflects habitual USB-drive use or if they're using the cloud in some way to store their files, but clearly student-driven storage demand (at least for home-directories) is falling.

One area where it is clearly increasing is the Blackboard Content volume.
bbcontent-2011.png
This data is noisy in that we purge old courses, but we've also changed how many quarters of courses we keep in the system. Looking at this growth chart, it's pretty clear to me that the downtick in student home-directory and class-volume consumption is made up for in increased Blackboard usage. Each quarter more and more professors sign on, other professors increase their usage, and the average size of the files being passed into the system increases.

The ebb and flow of student life

Looking at our bandwidth chart for the last day I can really tell it's finals week.
Student-Ebb-Flow.png
See that trail-off about 1am last night (Sunday)? That normally starts earlier and bottoms out faster when students aren't up doing finals-related things. I know from watching printing-activity reports for overnights that for the first part of finals week our printing activity between 1:30am and 6am is markedly higher than it is at any other time during the quarter. Bandwidth-usage also increases during this time as they take YouTube breaks and whatnot whilst typing madly. By Friday the chart should be a lot flatter as students who don't have end-of-week finals up-root and leave for home mid-week.

Back on printing, our morning peak starts earlier during finals. During most of the quarter usage starts rising about 6:30am and doesn't really get going until 7:30-8:00. This time of the quarter the rise starts at 6am, and is a lot busier, earlier. The steady drumbeat of mid-terms means that there are usually a couple of people pulling all-nighters starting roughly the 3rd week of classes, but finals really focus everyone.

Usage statistics

There is a not at all surprising disconnect between what Google Analytics reports for this blog and what logfile analysis reports. In light of the FTC's push for an "opt out" button for tracking, I'm guessing the JavaScript method of website tracking is going to be less effective.

Operating system:

             Google Analytics   Log analysis
Windows      66.7%              58.7%
Linux        23.8%              22.9%
Mac          9.2%               3.7%
Other        -                  14.6%

Interestingly, log analysis also breaks down the OS versions in use. I'm happy to note that the large majority of the Linux users are Suse variants. XP users still outnumber Vista/Win7 users.

Browser:

                    Google Analytics   Log analysis
Firefox             37.3%              44.3%
Internet Explorer   22.22%             21.9%
Chrome              29.37%             9.8%
Opera               3.17%              6.7%
Safari              4.76%              1.6%
Other/unknown       3.18%              15.7%

The other/unknown is likely the log-analysis engine's inability to figure out some agent strings. At a guess it's really under-reporting all the Chrome users out there. Even so, there are significant differences between the two. To me this looks like Firefox users are much more likely to be using NoScript.
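
To put numbers on the gaps, here is a minimal Python sketch that subtracts the Google Analytics share from the log-analysis share for each browser, using the figures from the browser table above. The function name `share_gaps` is just an illustration. A positive gap means the browser shows up more in the logs than in the script-based tracking, which would be consistent with script-blocking (or with an agent-string parsing problem), but it doesn't prove either.

```python
# Browser share (%) as reported by each method, copied from the table above.
google_analytics = {
    "Firefox": 37.3,
    "Internet Explorer": 22.22,
    "Chrome": 29.37,
    "Opera": 3.17,
    "Safari": 4.76,
    "Other/unknown": 3.18,
}
log_analysis = {
    "Firefox": 44.3,
    "Internet Explorer": 21.9,
    "Chrome": 9.8,
    "Opera": 6.7,
    "Safari": 1.6,
    "Other/unknown": 15.7,
}


def share_gaps(js_report, log_report):
    """Return log share minus script-tracked share, in percentage points."""
    return {name: round(log_report[name] - js_report[name], 2) for name in js_report}


gaps = share_gaps(google_analytics, log_analysis)

# Print the gaps from most under-reported in the logs to most over-reported.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<18} {gap:+6.2f}")
```

Chrome and Other/unknown move in opposite directions by similar amounts, which fits the guess that the log-analysis engine is filing a lot of Chrome agent strings under unknown.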

And finally, once browsers start scrambling the User Agent String, even that will not be useful for this kind of tracking.

We go through paper

We're a University, and you'd expect that in this modern era of iPads replacing textbooks and suchlike, our paper costs would be going down. You'd be wrong. We go through a heck of a lot of paper in a quarter. Spring quarter earlier this year generated 1,899,865 pages of printing, which is actually a bit up from what we did last year. Ouch.

For a nice visual clue to what we go through in a day, here is Monday of this week:
Pages-per-hour for September 27, 2010
41,375 pages is the total for Monday. Monday is also our heaviest printing day. That spike you see between 11am and Noon is regular. We've had the 11am printing peak for years. There is a smaller spike between 1 and 2pm. This time of quarter we don't have any printing going on at 5am, though the closer we get to Finals Week the more dark-o-night printing goes on.


Times change, alas

Right now we're giving serious consideration to using folder mount-points in Windows in order to solve a specific storage problem. The one thing that makes me go, "oh, please, no," is the fact that the disk-space monitoring script I've been using for years, the one that also monitors NetWare, Windows and ESX, can't handle folder-mounts. Why? Because the Windows SNMP agent doesn't give any information about folder-mounts, just drive-letter mounts.

SNMP was very nice since I didn't have to use Windows to get the information I needed. However, Microsoft hasn't been really paying attention to SNMP in recent versions so I am not at all surprised to learn that this hasn't been put in place. Or if it is, they're using a MIB I don't know about.

I suspect I'll have to carve my script up in twain, into Windows and non-Windows variants. That way I can continue to keep data in this particular database (with data that goes back to 2004!).
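
For what a Windows-side variant might look like, here is a minimal Python sketch. It is not the script described above: it runs locally on the server instead of polling over SNMP, and it leans on `shutil.disk_usage()` from the Python standard library. The working assumption, which should be checked on a real server before trusting it, is that pointing it at a folder mount-point reports the mounted volume and not the parent drive. The `space_report` helper and the `TARGETS` list are made-up names for illustration.

```python
import shutil


def space_report(paths):
    """Map each path to (total, used, free) bytes for the volume behind it."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        report[path] = (usage.total, usage.used, usage.free)
    return report


# Hypothetical target list. On Windows this could hold folder mount-points
# (for example r"D:\mounts\vol01", a made-up path) next to drive letters.
TARGETS = ["."]

for path, (total, used, free) in space_report(TARGETS).items():
    pct_used = 100.0 * used / total if total else 0.0
    print(f"{path}: {used} of {total} bytes used ({pct_used:.1f}%)")
```

The output would still need to be written into the existing stats database in whatever shape the SNMP-gathered rows already use, which this sketch doesn't attempt.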

But still, the core engineering of this guy was done back in 2001, with efforts later on to shim in Windows and ESX support. I looked into Linux a couple years ago and determined that I could add support for that pretty simply, but never did as we didn't have a call for it yet. 9 years is a long life for a script like this. I suppose it's time.

Or maybe we can not use folder-mounts.

The costs of backup upgrades

Our tape library is showing its years, and it's time to start moving the mountain required to get it replaced with something. So this afternoon I spent some quality time with Google, a spreadsheet, and some oldish quotes from HP. The question I was trying to answer is what's the optimal mix of backup to tape and backup to disk using HP Data Protector. The results were astounding.

Data Protector licenses backup-to-disk capacity by the amount of space consumed in the B2D directories. You have 15TB parked in your backup-to-disk archives, you pay for 15TB of space.

Data Protector has a few licenses for tape libraries. They have costs for each tape drive over 2, another license for libraries with 61 to 250 slots, and another license for unlimited slots. Unlike BackupExec and others, there is no license for fibre-attached libraries.

Data Protector does not license per backed up host, which is theoretically a cost savings.

When all is said and done, DP costs about $1.50 per GB in your backup to disk directories. In our case the price is a bit different since we've sunk some of those costs already, but they're pretty close to a buck fiddy per GB for Data Protector licensing alone. I haven't even gotten to physical storage costs yet, this is just licensing.

Going with an HP tape library (easy for me to spec, which is why I put it into the estimates), we can get an LTO4-based tape library that should meet our storage growth needs for the next 5 years. After adding in the needed DP licenses, the total cost per GB (uncompressed, mind) is on the order of $0.10 per GB. Holy buckets!

Calming down some: taking our current backup volume and apportioning the price of the largest tape library I estimated over that volume, the price rises to $1.01/GB. Which means that as we grow our storage, the price-per-GB drops as less of the infrastructure is being apportioned to each GB. That's a rather shocking difference in price.
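
The apportioning works out roughly like this Python sketch. The dollar and capacity figures are hypothetical, picked only so the arithmetic lands on the per-GB numbers quoted here ($0.10 full, $1.01 at current volume, about $1.50 for backup-to-disk licensing); the real quotes aren't reproduced, and tape media costs are ignored.

```python
B2D_RATE = 1.50  # approximate DP licensing cost per GB on disk, from the post

# Hypothetical figures, chosen only to reproduce the per-GB prices in the post.
TAPE_FIXED_COST = 100_000.0  # library plus DP licenses, in dollars (made up)
CURRENT_GB = 99_000.0        # current backup volume in GB (made up)
FULL_GB = 1_000_000.0        # library capacity in GB when full (made up)


def tape_cost_per_gb(fixed_cost, gb_stored):
    """A fixed infrastructure cost spread evenly over the data stored."""
    return fixed_cost / gb_stored


def b2d_license_cost(gb_stored, rate=B2D_RATE):
    """Backup-to-disk licensing grows linearly with the space consumed."""
    return gb_stored * rate


for gb in (CURRENT_GB, 250_000.0, 500_000.0, FULL_GB):
    tape = tape_cost_per_gb(TAPE_FIXED_COST, gb)
    print(f"{gb:>11,.0f} GB  tape ${tape:.2f}/GB  disk licensing ${B2D_RATE:.2f}/GB")
```

The tape figure falls as the volume grows because the cost is fixed, while the disk licensing rate stays flat per GB no matter how much is stored.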

Clearly, HP really really wants you to use their de-duplication features for backup-to-disk. Unfortunately for HP, their de-duplication technology has some serious deficiencies when presented with our environment so we can't use it for our largest backup targets.

But to answer the question I started out with, what kind of mix should we have, the answer is pretty clear. As little backup-to-disk space as we can get away with. The stuff has some real benefits, as it allows us to stage backups to disk and then copy to tape during the day. But for long term storage, tape is by far the more cost-effective storage medium. By far.

Browser usage on tech-blogs

Ars Technica just posted their August browser update. They also included their own browser breakdown. Ars Technica is a techie site, so it comes as no surprise whatsoever that Firefox dominates at 45% of browser-share. This made me think about my own readership.

Browser share piechart for September 09
As you can see, Firefox makes up even more of the browser-share here (50.34%). Interestingly on the low end, Opera is actually the #3 browser (4.46%), not Safari (3.43%). Looking at the version breakdown for those IE users, only 17% of them are on IE6. Yay!

Ars Technica's Safari numbers are not at all surprising, since they cover a fair amount of Apple news and I don't.

So yeah, Tech blogs and sites don't have a lot of IE traffic. Or, so I believe. What are your numbers?

Printing habits

Some students are going to be in for a rude, rude surprise real soon. Today alone there is a student who has printed off 210 pages. Looking at their print history, they printed off 100 copies of two specific handouts (in batches of 50), and that's 40% of their entire quota for the quarter. Once they hit the ceiling, they'll have to pay to get more. This is different from last year!

We always got a few students who rammed their head against the 500 page limit within two weeks of quarter start. I'm sure we'll get some this quarter too. There may be heated tempers at the Helpdesk as a result, but thems the breaks.

Historical data-center

As I've mentioned several times here, our data-center was designed and built in the 1999-2000 time frame. Before then, Technical Services had offices in Bond Hall up on campus. The University decided to move certain departments that had zero to very little direct student contact off of campus as a space-saving measure. Technical Services was one of those departments. As were Administrative Computing Services, Human Resources, Telecom, and Purchasing.

At that time, all of our stuff was in the Bond Hall data-center and switching room. That's a tiny area, and the opportunity to design a brand new data-center from scratch was a delightful one for those who were here to partake of it. This predates me (December 2003), so I may be wrong on some of this stuff.

At the time, our standard server was, if I've got the history right, the HP LH3. Like this:
An HP LH3
This beast is 7U's high. We were in the process of replacing them with HP ML530's, another 7U server, when the data-center move came, but I'm getting a bit ahead of myself. This means that the data-center was planned with 7U servers in mind. Not the 1-4U rack-dense servers that were very common at that time.

Because the 2U flat-panel monitor and keyboard drawers for rack-dense racks were so expensive, we decided to use plain old 15-17" CRTs and keyboard drawers in the racks themselves. These take up 14U. But that's not a problem!

A 42U rack can take 4x of those 7U servers, and one of the 14U monitor/keyboard combinations for a total of...42U! A perfect fit! The Sun side of the house had their own servers, but I don't know anything about those. With four servers per rack, we put in a Belkin 4-port PS-2 KVM switch (USB was still too new fangled in this era, our servers didn't really have USB ports in them yet) in each. As I said, a perfect fit.
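
The rack math is simple enough to check in a few lines of Python. This is only a sketch of the arithmetic above; `rack_fit` is a made-up helper name.

```python
RACK_U = 42     # height of the rack
SERVER_U = 7    # HP LH3 / ML530 class server
CONSOLE_U = 14  # CRT plus keyboard drawer


def rack_fit(rack_u, server_u, console_u):
    """Return (servers that fit, rack units left over) after the console."""
    available = rack_u - console_u
    return available // server_u, available % server_u


print(rack_fit(RACK_U, SERVER_U, CONSOLE_U))  # (4, 0)
print(rack_fit(RACK_U, 1, CONSOLE_U))         # (28, 0)
```

The second call shows that the same 28U of server space could in principle hold up to 28 1U servers, a very different load from the four the room was planned around.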

And since we could plan our very own room, we planned for expansion! A big room. With lots of power overhead. And a generator. And a hot/cold aisle setup.

Unfortunately... the designers of the room decided to use a bottom-to-top venting strategy for the heat. With roof mounted rack fans.
Rack fans

And... solid back doors.

Rack back doors

We got away with this because we only HAD four servers per rack, and those servers were dual processor 1GHz servers. So we only had 8 cores running in the entire rack. This thermal environment worked just fine. I think each rack probably drew no more than 2KW, if that much.

If you know anything about data-center air-flow, you know where our problems showed up when we moved to rack-dense servers in 2004-8 (and a blade rack). We've managed to get some fully vented doors in there to help encourage a more front-to-back airflow. We've also put some air-dams on top of the racks to discourage over-the-top recirculation.

And picked up blanking panels. When we had 4 monster servers per rack we didn't need blanking panels. Now that we're mostly on 1U servers, we really need blanking panels. Plus a cunning use of plexi-glass to provide a clear blanking panel for the CRTs still in the racks.

And now, we have a major, major budget crunch. We had to fight to get the fully perforated doors, and that was back when we had money. Now we don't have money, and still need to improve things. We're not baking servers right now, but temperatures are such that we can't raise the temp in the data-center very much to save on cooling costs. Doing that will require spending some money, and that's very hard right now.

Happily, rack-dense servers and ESX have allowed us to consolidate down to a lot fewer racks, where we can concentrate our good cooling design. Those are hot racks, but at least they aren't baking themselves like they would with the original kit.
