Robots and Your Download Statistics

When I worked in acquisitions, download statistics were quite useful.  They let us know that our massive digital collections were being used.  They are one of the most useful tools acquisitions can give bibliographers to help them choose which subscriptions to keep and which to drop.  They also gave us a glimpse into how people were using the digital collections and how they were accessing them.  For example, in a pilot program, we wanted to see whether evidence-based acquisitions (EBA) or demand-driven acquisitions (DDA) were sustainable and useful to our library.  The download statistics mattered because they told us what we purchased (or would purchase) on those plans.  We could also see how users were discovering our offerings; some vendors can provide data on how users were referred to a source (via Google Scholar, via our homegrown federated search, via the catalog, etc.).  Download stats are cool.

The University College Dublin Library (Leabharlann UCD) hosted guest blogger Joseph Greene (Research Repository UCD), who asked, “How accurate are our download statistics?”  I know how valuable the download statistics were in acquisitions, so I wanted to know if they were actually accurate.

Greene focuses on UCD Library’s institutional repository – rather than a vendor platform like, let’s say, JSTOR – but he provides interesting insight into where your stats might actually be coming from.


Many organizations use bots to crawl the web.  Think, for instance, of Google.  Google uses bots to crawl the internet, which allows it to index information and make it searchable.  The Internet Archive does something similar, as do link checkers.  So do scammers, phishers, etc.  From what Greene says, it can be hard to distinguish actual human users from robots.

Greene and his colleagues found that 85% of their downloads were from robots.  (Wow.)  UCD Library was able to distinguish most of the robots from human users.  Greene and his colleagues will present further findings at Open Repositories 2016 in Dublin, Ireland.

Although I’m unfortunately stuck in the United States until my institutional funding kicks in, it would be cool 1) to go to Ireland and 2) to find out how DSpace and EPrints filter out robots in their statistics.  (I wonder how WordPress or JSTOR or Project MUSE or any other platform does it too.)
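
As a rough illustration of the simplest version of the idea (and not a description of how DSpace, EPrints, or UCD Library actually do it), here is a minimal Python sketch that guesses whether a download came from a robot by looking at its user-agent string, then works out what share of downloads look robotic.  The pattern list, the function names, and the sample log entries are all made up for the example; a real repository would likely lean on a maintained robot list and other signals, such as download rate per IP address.

```python
import re

# Hypothetical substrings that often appear in crawler user-agent strings.
# This short list is an assumption for illustration, not a complete filter.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp|archiver", re.IGNORECASE)


def is_probable_robot(user_agent):
    """Guess whether a download came from a robot, using only its user agent."""
    # In this sketch, an empty or missing user agent is treated as a robot.
    if not user_agent:
        return True
    return bool(BOT_PATTERN.search(user_agent))


def robot_share(downloads):
    """Return the fraction (0.0 to 1.0) of downloads that look like robots."""
    if not downloads:
        return 0.0
    robots = sum(1 for d in downloads if is_probable_robot(d.get("user_agent")))
    return robots / len(downloads)


# Made-up log entries, for illustration only.
sample_downloads = [
    {"item": "thesis-001", "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1)"},
    {"item": "thesis-001", "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/47.0"},
    {"item": "article-042", "user_agent": "ia_archiver"},
    {"item": "article-042", "user_agent": ""},
]

print(f"Probable robot share: {robot_share(sample_downloads):.0%}")
```

A filter like this only catches robots that identify themselves, which is probably part of why telling humans from robots is as hard as Greene says.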


Gender in Library IT

In mid-May, I graduated from my library science program.  With my graduation, I had to leave my graduate assistantship in library acquisitions.  I am one of the fortunate graduates who have a job all lined up; I’ll start in July of the 2016-2017 fiscal year (just a few days away).  I am going to be a systems librarian.  Systems librarianship definitely falls under technical services (my seeming forte at this point in my life), but it also falls under library IT.

Knowing that I’ll be working in library IT (and with campus IT), I was interested when I saw that the LITA (Library and Information Technology Association) Blog had posted about gender in library IT.  “Let’s look at gender in library IT” sums up what you probably already know, but it also points out some interesting things that I wouldn’t have guessed.

So, let’s look at some highlights from the post:

  • ALA is 87% white, 81% female (as of 2015 data), but head-of-IT positions are predominantly held by men.
  • In library IT, men outnumber women 2.5:1.
  • Women author 65% of articles in non-technology library journals, but only 35% of articles in library IT journals.
  • A lot of IT positions in libraries aren’t labeled as librarian positions.
  • Library IT folks get paid more than their non-IT counterparts.
  • But women and men working in library IT get paid, for the most part, equally. (YES!)

For more, go check out the blog itself.

Different Generations, Different Information Consumption

Fractl and BuzzStream authored a study (“The Generational Content Gap”) which surveyed approximately 1,200 people on how they consumed information and then broke down how different generations differ in their content consumption.

“The Generational Content Gap” finds that Baby Boomers spend the most time on their screens – 20+ hours per week looking at online content.  (This is not surprising, since my mother is a late Boomer and spends 20+ hours online.)  These Boomers are usually online between 9 a.m. and noon.  GenXers and Millennials spend a measly 5-10 hours consuming online content, and they’re most active after 8 p.m.

The study also finds that all three generations love blogs, articles of about 300 words in length, and images.  They all dislike slide shows, white papers (no duh), and webinars.

Millennials are most likely to use their smartphones to access the internet, whereas GenXers and Boomers will use laptops and desktops.  Use of tablets is low for all; only about 10% of Boomers, GenXers and Millennials use a tablet to access the internet.

It’s an interesting study, and you can read more and view a cool infographic over at Contently’s blog.

Boomer and Millennial on phones
“My mother and I on our respective devices,” Rebecca Ciota, 2016.

Classifiers and Hoarders

Azariah Smith Root. Public Domain.

There seem to be two sorts of librarians in the world – the pack rats and the labelers.  (Or at least, I like to joke that that is the case.)  One wants to save everything; the other wants to classify everything into a neat order.  I always explain my undergraduate institution’s excellent library by noting that one of its early directors was a bona fide hoarder.  Azariah Smith Root would put just about anything made of paper in Oberlin College’s library; and once it was in the library, it wasn’t leaving.  Perhaps the most famous librarian in the United States, Melvil Dewey, was the other type of librarian – a classifier.  He wanted everything in the world to fit under his (highly Eurocentric) labels.

I’m starting to get the feeling that I am probably a labeler, a classifier, one of those organized types.

Because I graduated from my master’s program and am now pursuing new opportunities, I’m moving out of my long-term home and into a new residence.  Moving has forced me to look at all the stuff I’ve accumulated and evaluate whether I want to keep it.  The pack rats among us will be sad to learn that I have been happily weeding my collections (books, trinkets, etc.) with a gusto that surprises me.  I’m probably only taking half of my books; the others go to charity or are disposed of.  Many of my hand-drawn maps have gone to the recycling bin.  Photos have been torn up and thrown away.  For a person who wants to protect information (and has a soft spot for rare books and archives), I am shamefully happy to throw things away.

So, yeah, I’m probably not an Azariah Root.

Melvil Dewey. Public Domain.

Which leaves me with the unfortunate feeling that I might be a Dewey.  In my domestic space, everything has a spot and it always goes in that spot.  Well, now that I’m pulling everything off my shelves and out of my drawers and moving it all into boxes, nothing is in its place.  First, I can’t find anything; and second, I hate staring at the piles of unorganized material hogging my floor space.  I’ve come to realize that I want everything in its “proper” place, and not scattered to the four winds.

(I guess maybe it’s good I’m not a cataloger, because I think a backlog might drive me insane.  All those books, piled up and unorganized – ack!)

So, though my residence looks like I’m a pack rat, it is emotionally proving to me that I’m a classifier.  Maybe I went into librarianship because I like the order and organization of information – at least in part.

(Also, moving is a crazy time.)