Tag Archives: Bill Steele

arXiv, which helped kick off the open access movement, contemplates its future

arXiv is hosted by Cornell University and holds more than a million scientific papers that anyone can access. Here’s more from a July 22, 2016 news item on phys.org,

As the arXiv repository of scientific papers celebrates its 25th year as one of the scientific community’s most important means of communication, the site’s leadership is looking ahead to ensure it remains indispensable, robust and financially sustainable.

A July 21, 2016 Cornell University news release by Bill Steele, which originated the news item, provides more information about future plans and a brief history of the repository (Note: Links have been removed),

Changes and improvements are in store, many in response to suggestions received in a survey of nearly 37,000 users whose primary requests were for a more robust search engine and better facilities to share supplementary material, such as slides or code, that often accompanies scientific papers.

But even more important is to upgrade the underlying architecture of the system, much of it based on “old code,” said Oya Rieger, associate university librarian for digital scholarship and preservation services, who serves as arXiv’s program director. “We have to create a work plan to ensure that arXiv will serve for another 25 years,” she said. That will require recruiting additional programmers and finding additional sources of funding, she added.

The improvements will not change the site’s essential format or its core mission of free and open dissemination of the latest scientific research, Rieger said.

arXiv was created in 1991 by Paul Ginsparg, professor of physics and information science, when he was working at Los Alamos National Laboratory. It was then common practice for researchers to circulate “pre-prints” of their papers so that colleagues could have the advantage of knowing about their research in advance of publication in scientific journals. Ginsparg launched a service (originally running from a computer under his desk) to make the papers instantly available online.

Ginsparg brought the arXiv with him from Los Alamos when he joined the Cornell faculty in 2001. Since then, it has been managed by Cornell University Library, with Ginsparg as a member of its scientific advisory board.

In 2015, arXiv celebrated its millionth submission and saw 139 million downloads in that year alone.

Nearly 95 percent of respondents to the survey said they were satisfied with arXiv, many saying that rapid access to research results had made a difference in their careers, and applauding it as an advance in open access.

“We were amazed and heartened by the outpouring of responses representing users from a variety of countries, age groups and career stages. Their insight will help us as we refine a compelling and coherent vision for arXiv’s future,” Rieger said. “We’re continuing to explore current and emerging user needs and priorities. We hope to secure funding to revamp the service’s infrastructure and ensure that it will continue to serve as an important scientific venue for facilitating rapid dissemination of papers, which is arXiv’s core goal.”

Though some users suggested new or additional features, a majority of respondents emphasized that the clean, unencumbered nature of the site makes its use easy and efficient. “I sincerely wish academic journals could try to emulate the cleanness, convenience and user-friendly nature of the arXiv, and I hope the future of academic publishing looks more like what we’ve been able to enjoy in the arXiv,” one user wrote.

arXiv is supported by a global collective of nearly 200 libraries in 24 countries, and an ongoing grant from the Simons Foundation. In 2012, the site adopted a new funding model, in which it is collaboratively governed and supported by the research communities and institutions that benefit from it most directly.

Having a bee in my bonnet about overproduced websites (MIT [Massachusetts Institute of Technology], I’m looking at you), I can’t help but applaud this user and, of course, arXiv, “I sincerely wish academic journals could try to emulate the cleanness, convenience and user-friendly nature of the arXiv, and I hope the future of academic publishing looks more like what we’ve been able to enjoy in the arXiv, …”

For anyone interested in arXiv plans, there’s the arXiv Review Strategy here on Cornell University’s Confluence website.

Robo Brain; a new robot learning project

Having covered the RoboEarth project (a European Union funded ‘internet for robots’ first mentioned here in a Feb. 14, 2011 posting [scroll down about 1/4 of the way], again in a March 12, 2013 posting about the project’s cloud engine, Rapyuta, and, most recently, in a Jan. 14, 2014 posting), an Aug. 25, 2014 Cornell University news release by Bill Steele (also on EurekAlert with some editorial changes) about the US Robo Brain project immediately caught my attention,

Robo Brain – a large-scale computational system that learns from publicly available Internet resources – is currently downloading and processing about 1 billion images, 120,000 YouTube videos, and 100 million how-to documents and appliance manuals. The information is being translated and stored in a robot-friendly format that robots will be able to draw on when they need it.

The news release spells out why and how researchers have created Robo Brain,

To serve as helpers in our homes, offices and factories, robots will need to understand how the world works and how the humans around them behave. Robotics researchers have been teaching them these things one at a time: How to find your keys, pour a drink, put away dishes, and when not to interrupt two people having a conversation.

This will all come in one package with Robo Brain, a giant repository of knowledge collected from the Internet and stored in a robot-friendly format that robots will be able to draw on when they need it. [emphasis mine]

“Our laptops and cell phones have access to all the information we want. If a robot encounters a situation it hasn’t seen before it can query Robo Brain in the cloud,” explained Ashutosh Saxena, assistant professor of computer science.

Saxena and colleagues at Cornell, Stanford and Brown universities and the University of California, Berkeley, started in July to download about one billion images, 120,000 YouTube videos and 100 million how-to documents and appliance manuals, along with all the training they have already given the various robots in their own laboratories. Robo Brain will process images to pick out the objects in them, and by connecting images and video with text, it will learn to recognize objects and how they are used, along with human language and behavior.

Saxena described the project at the 2014 Robotics: Science and Systems Conference, July 12-16 [2014] in Berkeley.

If a robot sees a coffee mug, it can learn from Robo Brain not only that it’s a coffee mug, but also that liquids can be poured into or out of it, that it can be grasped by the handle, and that it must be carried upright when it is full, as opposed to when it is being carried from the dishwasher to the cupboard.

The system employs what computer scientists call “structured deep learning,” where information is stored in many levels of abstraction. An easy chair is a member of the class of chairs, and going up another level, chairs are furniture. Sitting is something you can do on a chair, but a human can also sit on a stool, a bench or the lawn.

A robot’s computer brain stores what it has learned in a form mathematicians call a Markov model, which can be represented graphically as a set of points connected by lines (formally called nodes and edges). The nodes could represent objects, actions or parts of an image, and each one is assigned a probability – how much you can vary it and still be correct. In searching for knowledge, a robot’s brain makes its own chain and looks for one in the knowledge base that matches within those probability limits.

“The Robo Brain will look like a gigantic, branching graph with abilities for multidimensional queries,” said Aditya Jami, a visiting researcher at Cornell who designed the large-scale database for the brain. It might look something like a chart of relationships between Facebook friends but more on the scale of the Milky Way.

Like a human learner, Robo Brain will have teachers, thanks to crowdsourcing. The Robo Brain website will display things the brain has learned, and visitors will be able to make additions and corrections.
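To make the "nodes and edges" description a little more concrete, here is a minimal Python sketch of a knowledge graph with levels of abstraction and a confidence cutoff on lookups. It is not Robo Brain's code and it is not a Markov model; it is a plain dictionary-based illustration. Every name in it (KnowledgeGraph, add, ancestors, query, the "is_a" and "affords" relations, and all the confidence numbers) is an assumption invented for this example, borrowing only the chair and coffee mug examples from the news release.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy graph: (subject, relation) -> {object: confidence}.

    Hypothetical structure for illustration only; not Robo Brain's format.
    """
    edges: dict = field(default_factory=dict)

    def add(self, subject, relation, obj, confidence):
        # Store one labelled edge with an (invented) confidence in [0, 1].
        self.edges.setdefault((subject, relation), {})[obj] = confidence

    def ancestors(self, node):
        """Yield node, then each more abstract class reached via 'is_a' edges."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            yield current
            stack.extend(self.edges.get((current, "is_a"), {}))

    def query(self, node, relation, min_confidence=0.5):
        """Collect facts about node, inheriting from its more abstract classes.

        Simplification: the confidence on 'is_a' edges is stored but not
        used to discount inherited facts.
        """
        found = {}
        for cls in self.ancestors(node):
            for obj, conf in self.edges.get((cls, relation), {}).items():
                if conf >= min_confidence:
                    found[obj] = max(conf, found.get(obj, 0.0))
        return found


def build_example():
    """Build a tiny graph from the chair and coffee mug examples."""
    kg = KnowledgeGraph()
    kg.add("easy_chair", "is_a", "chair", 0.95)
    kg.add("chair", "is_a", "furniture", 0.99)
    kg.add("chair", "affords", "sitting", 0.9)
    kg.add("coffee_mug", "affords", "pouring", 0.9)
    kg.add("coffee_mug", "affords", "grasp_by_handle", 0.8)
    kg.add("coffee_mug", "affords", "throwing", 0.1)  # below the cutoff
    return kg
```

Calling build_example().query("easy_chair", "affords") walks up from easy chair to chair to furniture and picks up "sitting" along the way, while the low-confidence "throwing" edge on the coffee mug is filtered out by the cutoff.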

The “robot-friendly format” for information in the European project (RoboEarth) meant machine language but, if I understand what’s written in the news release correctly, this project incorporates a mix of machine language and natural (human) language.

This is one of the times the funding sources (US National Science Foundation, two of the armed forces, businesses and a couple of not-for-profit agencies) seem particularly interesting (from the news release),

The project is supported by the National Science Foundation, the Office of Naval Research, the Army Research Office, Google, Microsoft, Qualcomm, the Alfred P. Sloan Foundation and the National Robotics Initiative, whose goal is to advance robotics to help make the United States more competitive in the world economy.

For the curious, here’s a link to the Robo Brain and RoboEarth websites.

From Cornell University, a liquid that remembers its shape

Sometimes one experiences a frisson (shiver) when reading about a piece of research. Let’s see how you do with this Dec. 4, 2012 news item on Nanowerk,

A bit reminiscent of the Terminator T-1000, a new material created by Cornell researchers is so soft that it can flow like a liquid and then, strangely, return to its original shape.

Rather than liquid metal, it is a hydrogel, a mesh of organic molecules with many small empty spaces that can absorb water like a sponge. It qualifies as a “metamaterial” with properties not found in nature and may be the first organic metamaterial with mechanical meta-properties.

The Dec. 3, 2012 Cornell University news article by Bill Steele, which originated the news item, goes on to explain the interest in hydrogels and what makes this particular formulation so special,

Hydrogels have already been considered for use in drug delivery — the spaces can be filled with drugs that release slowly as the gel biodegrades — and as frameworks for tissue rebuilding. The ability to form a gel into a desired shape further expands the possibilities. For example, a drug-infused gel could be formed to exactly fit the space inside a wound.

The new hydrogel is made of synthetic DNA. In addition to being the stuff genes are made of, DNA can serve as a building block for self-assembling materials. Single strands of DNA will lock onto other single strands that have complementary coding, like tiny organic Legos. By synthesizing DNA with carefully arranged complementary sections Luo’s [Dan Luo, professor of biological and environmental engineering] research team previously created short strands that link into shapes such as crosses or Y’s, which in turn join at the ends to form meshlike structures to form the first successful all-DNA hydrogel. Trying a new approach, they mixed synthetic DNA with enzymes that cause DNA to self-replicate and to extend itself into long chains, to make a hydrogel without DNA linkages.

“During this process they entangle, and the entanglement produces a 3-D network,” Luo explained. But the result was not what they expected: The hydrogel they made flows like a liquid, but when placed in water returns to the shape of the container in which it was formed.

“This was not by design,” Luo said.

See the material for yourself,

Hydrogels made in the form of the letters D, N and A collapse into a liquid-like state on their own but return to the original shape when surrounded by water. Provided/Luo Lab

Nature Nanotechnology published the team’s research online Dec. 2, 2012 and, unusually, the article is open access (at least for now),

A mechanical metamaterial made from a DNA hydrogel by Jong Bum Lee, Songming Peng, Dayong Yang, Young Hoon Roh, Hisakage Funabashi, Nokyoung Park, Edward J. Rice, Liwei Chen, Rong Long, Mingming Wu & Dan Luo in Nature Nanotechnology (2012) doi:10.1038/nnano.2012.211 published online Dec. 2, 2012

Depending on your reading interests and time available, Bill Steele’s Cornell University article has more detail than I’ve provided here or you can check out the well-illustrated article in Nature Nanotechnology. As these things go, it’s quite readable, as you can see from the abstract (Note: I have removed footnotes),

Metamaterials are artificial substances that are structurally engineered to have properties not typically found in nature. To date, almost all metamaterials have been made from inorganic materials such as silicon and copper, which have unusual electromagnetic or acoustic properties that allow them to be used, for example, as invisible cloaks, superlenses or super absorbers for sound. Here, we show that metamaterials with unusual mechanical properties can be prepared using DNA as a building block. We used a polymerase enzyme to elongate DNA chains and weave them non-covalently into a hydrogel. The resulting material, which we term a meta-hydrogel, has liquid-like properties when taken out of water and solid-like properties when in water. Moreover, upon the addition of water, and after complete deformation, the hydrogel can be made to return to its original shape. The meta-hydrogel has a hierarchical internal structure and, as an example of its potential applications, we use it to create an electric circuit that uses water as a switch.

For anyone not familiar with the Terminator movies, here’s an essay in Wikipedia about the ‘franchise’. Take special note of the second movie in the series, Terminator 2: Judgment Day, which introduced a robot (played by Robert Patrick) that could morph from a liquid-like state into various lethal entities.

Princeton goes Open Access; arXiv is 20 years old

Open access to science research papers seems only right given that most Canadian research is publicly funded. (As I understand it, most research worldwide is publicly funded.)

This week, Princeton University declared that its researchers’ work would be mostly open access (from the Sept. 28, 2011 news item on physorg.com),

Prestigious US academic institution Princeton University has banned researchers from giving the copyright of scholarly articles to journal publishers, except in certain cases where a waiver may be granted.

Here’s a little more from the Sept. 28, 2011 posting by Sunanda Creagh (based in Australia) on The Conversation blog,

The new rule is part of an Open Access policy aimed at broadening the reach of their scholarly work and encouraging publishers to adjust standard contracts that commonly require exclusive copyright as a condition of publication.

Universities pay millions of dollars a year for academic journal subscriptions. People without subscriptions, which can cost up to $25,000 a year for some journals or hundreds of dollars for a single issue, are often prevented from reading taxpayer funded research. Individual articles are also commonly locked behind pay walls.

Researchers and peer reviewers are not paid for their work but academic publishers have said such a business model is required to maintain quality.

This Sept. 29, 2011 article by James Chang for the Princetonian adds a few more details,

“In the interest of better disseminating the fruits of our scholarship to the world, we did not want to put it artificially behind a pay wall where much of the world won’t have access to it,” committee chair and computer science professor Andrew Appel ’81 said.

The policy passed the Faculty Advisory Committee on Policy with a unanimous vote, and the proposal was approved on Sept. 19 by the general faculty without any changes.

A major challenge for the committee, which included faculty members in both the sciences and humanities, was designing a policy that could comprehensively address the different cultures of publication found across different disciplines.

While science journals have generally adopted open-access into their business models, humanities publishers have not. In the committee, there was an initial worry that bypassing the scholarly peer-review process that journals facilitate, particularly in the humanities, could hurt the scholarly industry.

At the end, however, the committee said they felt that granting the University non-exclusive rights would not harm the publishing system and would, in fact, give the University leverage in contract negotiations.

That last comment about contract negotiations is quite interesting as it brings to mind the California boycott of the Nature journals last year, when Nature made a bold attempt to raise subscription fees substantially (400%) after having given the University of California special deals for years (my June 15, 2010 posting).

Creagh’s posting features some responses from Australian academics such as Simon Marginson,

Having prestigious universities such as Princeton and Harvard fly the open access flag represented a step forward, said open access advocate Professor Simon Marginson from the University of Melbourne’s Centre for the Study of Higher Education.

“The achievement of free knowledge flows, and installation of open access publishing on the web as the primary form of publishing rather than oligopolistic journal publishing subject to price barriers, now depends on whether this movement spreads further among the peak research and scholarly institutions,” he said.

“Essentially, this approach – if it becomes general – normalises an open access regime and offers authors the option of opting out of that regime. This is a large improvement on the present position whereby copyright restrictions and price barriers are normal and authors have to attempt to opt in to open access publishing, or risk prosecution by posting their work in breach of copyright.”

“The only interests that lose out under the Princeton proposal are the big journal publishers. Everyone else gains.”

Whether you view Princeton’s action as a negotiating ploy and/or a high-minded attempt to give freer access to publicly funded research, this certainly puts pressure on the business models that scholarly publishers follow.

arXiv, celebrating its 20th anniversary this year, is another open access initiative although it didn’t start that way. From the Sept. 28, 2011 news item on physorg.com,

“I’ve heard a lot about how democratic the arXiv is,” Ginsparg [Paul Ginsparg, professor of physics and information science] said Sept. 23 in a talk commemorating the anniversary. People have, for example, praised the fact that the arXiv makes scientific papers easily available to scientists in developing countries where subscriptions to journals are not always affordable. “But what I was trying to do was set up a system that eliminated the hierarchy in my field,” he said. As a physicist at Los Alamos National Laboratory, “I was receiving preprints long before graduate students further down the food chain,” Ginsparg said. “When we have success we like to think it was because we worked harder, not just because we happened to have access.”

Bill Steele’s Sept. 27, 2011 article for Cornell University’s ChronicleOnline notes,

One of the surprises, Ginsparg said, is that electronic publishing has not transformed the seemingly irrational scholarly publishing system in which researchers give their work to publishing houses from which their academic institutions buy it back by subscribing to journals. Scholarly publishing is still in transition, Ginsparg said, due to questions about how to fund electronic publication and how to maintain quality control. The arXiv has no peer-review process, although it does restrict submissions to those with scientific credentials.

But the lines of communication are definitely blurring. Ginsparg reported that a recent paper posted on the arXiv by Alexander Gaeta, Cornell professor of applied and engineering physics, was picked up by bloggers and spread out from there. The paper is to be published in the journal Nature and is still under a press embargo, but an article about it has appeared in the journal Science.

Interesting, eh? It seems that scholarly publishing need not disappear but there’s no question its business models are changing.