Saturday, June 2, 2012

A review of the book Too Big to Know

I know that there are already tons of reviews and information out there about David Weinberger's newish book, Too Big to Know, but I like writing these reviews so that I can remember what struck me about the 231-page book, and its implications for academic libraries and librarians.

In the past, to reduce the flood of information, there was "an elaborate system of editorial filters that have prevented most of what's written from being published." Weinberger also noted that "Knowledge has been about reducing what we need to know." (page 4)  Today, rather than limiting knowledge "to what fits in a library or a scientific journal, we are now knowing-by-including every draft of every idea in a vast, loosely connected web" of information. (page 5)  Thus, we are beginning to filter our feeds of information post-publication instead of pre-publication.  "Filters no longer filter out.  They filter forward, bringing their results to the front." (page 11)

"We now know that there's too much for us to know."  This has consequences for old institutions because "the task is just too large." (page 11)  We need to create new technologies that can handle the filtering and text mining of huge amounts of data and information.

Some of the book talks about the work of Jack Hidary.  He left a career as a scientist at NIH because he felt that "putting scientific papers through the traditional peer-review process had begun to seem frustratingly outdated." (page 15)

In the past, one could trust the validity of data that was "professionally published and stocked in your library" because it came from a reputable publisher [however one defines that]. For example, if you didn't trust the population figure for Pittsburgh in an almanac, you could in theory hire your own census takers to make your own count, but that is not feasible.  Thus, the almanac was a stopping point, and we trust that its data is accurate enough for our needs. (page 21)  But if you are looking for medical, scientific, or business data where a wrong source has serious consequences, you might check multiple sources or duplicate the research process to create a new data point that may or may not agree with the originally published facts.

On page 95, he brings up Robert Darnton, and how he would like to see books that allow readers to look at other books in their totality, and not just single lines or passages.  This could "open up new ways of making sense of the [historical] evidence."

In the past, the shape of knowledge was a triangle, with authorities at the top, and knowledge passed down to lower levels as needed.  Now, "knowledge on the Net has no shape because the Net has no outer edge."  "The shapelessness of knowledge reflects its reinvigoration, but at the cost of removing the central points of authority around which business, culture, science, and government used to pivot." (page 110) Authorities can be anywhere and come from anywhere.  The authorities of knowledge today do not have to live within the old publication and journal article system.

We are not going to resolve the question of whether the internet is good or bad for knowledge.  "That is too intertwingly." (page 117)  "We can learn how to use the Net to help us know and understand our world better," and we should teach our children how to search and learn from the Net.  That is a great job for librarians and parents.

"If books taught us that knowledge is a long walk from A to Z, the networking of knowledge may be teaching us that the world itself is more like a shapeless, intertwingled, unmasterable web" of information. (page 119)

Chapter 7 is "Too Much Science."  Here, he covers the huge amounts of data that today's scientists face, and how measures of scientific importance are changing.  For example, "The impact factor today reflects what was important two to three years ago." (page 137)

"Mendeley is being felt outside of the population of Mendeley users because it can give a much faster view of what papers are mattering to scientists than can the impact factor." (page 138)

Peter Binfield (formerly of PLoS ONE) noted that "Scientific journals rarely publish research with negative results."  This is a problem because it would be useful to scientists to know what other people have tried and failed at doing. (page 139)  He then discusses Jean-Claude Bradley's work on pages 139-141.  The policies of PLoS ONE, Peter Suber, and Open Access are discussed further on pages 141-143.

After a good discussion of some past projects of John Wilbanks, he noted that it is more important "that we be able to share data than that we agree on exactly how that data should be categorized, organized and named.  We have given up on the idea that there is a single, knowable organization of the universe." (page 148)

"If electronic media were hazardous to intelligence, the quality of science would be plummeting," but that is not the case. (page 150, quoting Richard Smith)  Science is taking advantage of electronic media to work smarter, better, and faster.

"Call the decision not to track down the hardcopy in a library laziness if you want--and there are times when it will have bad consequences--but it feels like efficiency." (page 175)  Yup, if it isn't online and easily downloadable, most people will not jump through hoops or pay $35 to download the article.  They will take the shortest path to the information they think they need.

This is good advice to publishers: "don't try to reduce the network's inherent abundance by introducing artificial scarcities, such as imposing on digital libraries all the limitations on access inherent in physical libraries." (page 183)  Yet so many ebook services try to limit access or use DRM that limits the functionality of an ebook in library settings.  These publishers are just making it frustrating for readers of their content.  Then, on pages 183-185, he again covers the advantages of opening up access to journals and Creative Commons licenses.

"Libraries not only have content in books and articles, they have the expertise of librarians, they have metadata about usage patterns that can be used to guide researchers, and they are at the center of communities of scholars who are the most learned people in their fields."  (page 191)

"If we want the Net to move knowledge forward, then we need to educate our children from the earliest possible age about how to use the Net, how to evaluate knowledge claims, and how to love difference....  But, knowing how to click buttons is the least of our concerns." (page 192)  Trying to teach undergraduates how to evaluate information is really tough.  Unfortunately, I probably spend too much time showing students how to use our databases and how to click around the interface, and not enough time on evaluating how good one article is compared to another, or comparing one journal with another.  He noted that "learning how to evaluate knowledge claims--is never ending." (page 192)

Overall, I loved the book.  Run out and buy a copy, or check it out from your library.
