Public Domain Databases in Chemistry

by Oliver Seely, Dept. of Chemistry, CSU Dominguez Hills

PUBLIC DOMAIN AND FAIR USE

Written works in the public domain may be used without having to obtain anyone's permission. One may copy them, publish them for profit, put them on Web pages, purchase billboard space for them and buy commercial TV time for them all without having to worry about being hauled into court for having infringed on someone's intellectual property rights. Public domain materials are the common property of us all.

"Fair use" of copyrighted material may be made without permission if such use is for "criticism, comment, news reporting, teaching, scholarship and research." Fair Use is described in Section 107 of the U.S. Copyright Act(1), but in a document concerning the fair use doctrine(2), the U.S. Copyright Office treats would-be fair-users of copyrighted material to a chilling warning: "The distinction between "fair use" and infringement may be unclear and not easily defined. There is no specific number of words, lines, or notes that may be safely taken without permission. Acknowledging the source of the copyrighted material does not substitute for obtaining permission." Although it is much easier to find numerous treatments of the fair-use doctrine(3)(4)(5)(6) than those treating the subject of public domain, these documents soon immerse the reader in depressingly murky uncertainty about what may be fairly used. Most unfortunately, this uncertainty extends beyond that of fair use. The most timid soul among us, in deciding to use only public domain materials for his class is awakened to a rude shock: Finding definitive statements and examples about which works exist without any dispute in the public domain is surprisingly difficult. Question the average academician about bodies of knowledge other than the alphabet and expect some ambivalence in the answer. Although there has been a U.S. Copyright Register for 100 years, there is at this time no "public domain register." The "Rule of Thumb" published by Project Gutenberg(7) is a rare offering for determining when otherwise copyrightable works enter the public domain.(8) Project Gutenberg announced in 1994 that it would offer an index of works in the public domain in 1997 to coincide with the 100thanniversary of the U.S. Copyright Register, but with recent impending changes in the U.S. Copyright Law that project has been put on hold. 
In view of the uncertainty of the meaning of fair use in the age of electronic media, vigorous attempts are now being made to stake out territory which establishes liability and jurisdiction of intellectual property.(9) It is not difficult to find claims of wide boundaries of copyright protection, supported by abundant examples of situations claimed to be infringement(10) and by the use of a legal nuance poorly understood by non-students of copyright law. It seems that the presentation of the score of a Mozart symphony, the font, the size of the staves and the general appearance of the notes on the page is copyrightable even if the portrayal of how the piece sounds (the notes, the harmony, the dynamics and the designation of which instrument plays which staff -- that is, the master's music) is not. Likewise, most of the presentations of the CRC Handbook of Chemistry and Physics(11) would likely fall into the same category; that is, one cannot legally make and distribute unlimited numbers of photocopies of the Handbook presentation of ideas and discoveries which clearly belong to the public domain because such photocopies would be violations of the copyright of the presentation of those ideas. Still, that there is no requirement for the owner of a copyright to state up front just what it is that he is copyrighting muddies the water considerably for all of us interested in treating information more as knowledge than as commodity. One doesn't often find a statement like the following in a copyright notice: "The reprinting of any portion of the Special Helps or References in this Bible without the publisher's permission is forbidden."(12) That is to say, we have copyrighted our creative contribution to this publication -- the annotations -- but you are free to make photocopies of the Biblical text without fear of copyright infringement.
How enlightened it would be of the CRC if in the same spirit its copyright notice stated, "Direct photocopies of the pages of this Handbook are an infringement of the U.S. Copyright of this presentation, except as allowed by the Fair Use Doctrine; moreover, there would be no infringement if you were to scan these data, digitize them and reproduce them, even in a format identical to our own."(13) The requirement for such a disclaimer by some future copyright act would of course go a long way toward clarifying this issue, but such clarification does not work in the interest of those who would publish for profit.

Article I, Section 8 of the U.S. Constitution states that "The Congress shall have power. . .To promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries." The section dealing with copyright is viewed on the one hand as giving authors exclusive rights over their works for limited periods so as to encourage future creative activity, and on the other as being limited when it seems to conflict with the overriding public interest of encouraging further development; it is not meant to promote the author's personal interests or to increase the author's personal wealth.(14) We are also assured by the Copyright Office, after the 1976 revision of the Copyright Act, that the Act's primary purpose is to promote the creation and dissemination of knowledge and ideas.(15) Even so, the lack of broad definitions of what constitutes fair use, together with the abundance of court cases laying out examples of infringement, fairly inhibits the educational use of all written works. The Copyright Office does advise that among those materials which cannot be copyrighted, that is, those which lie in the public domain, are ideas, concepts, discoveries (often identified as facts), methods, principles, procedures, processes, systems, theorems, work in which the copyright has expired, and work that the legal owner has placed in the public domain. In the same document, however, the reader is admonished that one cannot be assured of public domain status merely because (1) there is no copyright notice, (2) the work is distributed at no charge, or (3) the work is found in a public place.(16) Moreover, one is warned that unless one is certain that a work is in the public domain it is best to assume that it is copyrighted.
No wonder then that educators occupy the spectrum between those who copy nothing for their classes (an endangered species at best) through the middle ground of those who follow the philosophy "When in doubt, copy," to those who copy everything, relying on the broadest interpretation of fair use and hoping that they won't be the defendants in an infringement lawsuit. In any case, the establishment by the U.S. Copyright Office of the categories not subject to copyright gives us a starting point of sorts.

WHAT CAN WE USE WITHOUT ASKING ANYONE'S PERMISSION?

Taking those things not subject to copyright in the sequence given by the U.S. Copyright Office,

(1) Idea.

The idea that we live in an expanding universe was first proposed, with supporting data, by Edwin Hubble in 1929.

This idea is not subject to copyright although one might find it in numerous copyrighted texts on the subject; moreover, this original expression of that idea may be copyrighted as could be any original and alternative expression of the same idea.

(2) Concept.

In 1953 Stanley Miller demonstrated the formation of amino acids with the input of electrical and light energy in a chamber containing what was thought at the time to be simple constituents of the earth's ancient oceans. This experiment supported the so-called heterotroph hypothesis of Alexander Oparin and the concept of the spontaneous generation of life on the early earth.

The concept of the spontaneous generation of life is not subject to copyright although one can find many copyrighted works in which it appears, and this new creative summary could be copyrighted.

(3) Discovery (or fact).

Name | Synonyms and Formulae | Molecular Weight | Crystalline form, Properties, Index of Refraction | Density or Specific gravity | M.P. (°C) | B.P. (°C) | Solubility, cold water (i, ss, s, vs) | Solubility, hot water (i, ss, s, vs) | Other Solvents
Actinium | Ac | 227 | silvery white metal, cubic | | 1050 | 3200±300 | decomp. to Ac(OH)3 | |
Actinium bromide | AcBr3 | 466.73 | White, hexagonal | 5.85 | Subl. 800 | | s | |
Actinium trichloride | AcCl3 | 333.36 | White crystals, hexagonal | 4.81 | Subl. 960 | | | |



Most of the information shown above is not subject to copyright because it represents a collection of discoveries. The headings and their sequence are identical to those used in the presentation offered by the CRC.(17) It is our opinion, as it was that of the justices of the U.S. Supreme Court in an unrelated case but one in which there are surprising similarities,(18) that the arrangement of these facts is "not original in any way." That is, one may not copyright a format. We leave it to the readers acting in concert to complete the table for our common use and that of our students.
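Because the entries are facts, anyone can verify them independently. As a minimal sketch of such a check, the molecular weights in the rows above can be recomputed from the formulae. The atomic weights used here are our own assumed values, not taken from the Handbook, and the function names and the tolerance are purely illustrative.

```python
# Atomic weights in g/mol. These are assumed illustrative values,
# not figures taken from the table's source.
ATOMIC_WEIGHT = {"Ac": 227.0, "Br": 79.904, "Cl": 35.453}

# (name, composition, tabulated molecular weight) copied from the rows above.
ROWS = [
    ("Actinium", {"Ac": 1}, 227.0),
    ("Actinium bromide", {"Ac": 1, "Br": 3}, 466.73),
    ("Actinium trichloride", {"Ac": 1, "Cl": 3}, 333.36),
]


def molecular_weight(composition):
    """Sum atomic weight times atom count over every element in the formula."""
    return sum(ATOMIC_WEIGHT[element] * count
               for element, count in composition.items())


def flag_discrepancies(rows, tolerance=0.05):
    """Return the names whose tabulated weight differs from the recomputed one."""
    flagged = []
    for name, composition, tabulated in rows:
        if abs(molecular_weight(composition) - tabulated) > tolerance:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    for name, composition, tabulated in ROWS:
        print(name, round(molecular_weight(composition), 3), tabulated)
    print("flagged:", flag_discrepancies(ROWS))
```

Small differences in the last decimal place are to be expected, since tabulated atomic weights have been revised over the years; the tolerance is there to absorb them.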

(4) Method.

A primary standard may be weighed and dissolved in water to produce a measured volume of solution of known concentration. This is called the direct method for preparing a standardized solution.

This method is not subject to copyright, though the source from which it was taken is a copyrighted collection of related works. This unique and stunningly original interpretation of the method could be copyrighted.
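The arithmetic behind the direct method is simply moles of primary standard divided by the volume of solution. The sketch below illustrates it with a hypothetical example of our own choosing: potassium hydrogen phthalate (molar mass taken as 204.22 g/mol) made up to 250.0 mL. Neither the compound nor the quantities come from the source of the method.

```python
def molarity(mass_g, molar_mass_g_per_mol, volume_l):
    """Concentration in mol/L of a solution made by the direct method."""
    if molar_mass_g_per_mol <= 0 or volume_l <= 0:
        raise ValueError("molar mass and volume must be positive")
    moles = mass_g / molar_mass_g_per_mol
    return moles / volume_l


if __name__ == "__main__":
    # Assumed example: 5.1055 g of potassium hydrogen phthalate
    # (204.22 g/mol) dissolved and diluted to 0.2500 L.
    concentration = molarity(5.1055, 204.22, 0.2500)
    print(f"{concentration:.4f} mol/L")
```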

(5) Principle.

The Cosmological Principle states that the universe is homogeneous, that there is no preferred position of observation and that it is isotropic, that it ought to appear the same regardless of the direction of the view.

This principle cannot be copyrighted, even though the document from which it was taken is copyrighted and the remarkably lucid and original description above may be copyrighted.

(6) Procedure.

Cleaning solution for laboratory glassware may be prepared by pouring slowly while stirring a sufficient amount of concentrated sulfuric acid into 10 mL saturated aqueous sodium dichromate to produce 250 mL total volume.

This procedure is not subject to copyright though it was taken from a copyrighted source and this new description of that procedure could well be copyrighted, though as is the case for every other decision to copyright some work, the creative effort involved becomes an issue. It would be a bit of a stretch to claim that the description above represents a substantially original variation on a well-known and time-honored laboratory procedure.

(7) Process.

A sequence of DNA can be reproduced millions of times by using two initiating oligonucleotides each of which anneals to a complementary strand on either side of the sequence to be amplified. The initiators and mononucleotides may be introduced at a molar concentration 9-12 orders of magnitude greater than the desired sequence. In the presence of an appropriate heat-resistant polymerase and over the course of 20-40 heating and cooling cycles, DNA fragments containing the desired sequence will be reproduced at identical lengths with each initiator offering a termination point at its 5' end for each complementary strand.

A process such as the polymerase chain reaction is not subject to copyright, although the process is patented and using the process for profit would likely constitute a violation of patent rights.
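The phrase "millions of times" follows from simple doubling. As a rough sketch, and assuming an idealized reaction in which every cycle copies a fixed fraction of the existing sequences (real reactions fall short of this and eventually level off), the number of copies after the 20-40 cycles mentioned above can be estimated as follows. The function name and the efficiency parameter are our own illustrative choices.

```python
def ideal_copies(cycles, efficiency=1.0):
    """Copies per starting molecule after a number of cycles.

    efficiency is the assumed fraction of molecules copied in each
    cycle; 1.0 means perfect doubling.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for cycles in (20, 30, 40):
        print(cycles, f"{ideal_copies(cycles):.3e}")
```

With perfect doubling, 20 cycles already give about a million copies and 40 cycles about a trillion, which is why the initiators must be present in such large excess.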

(8) System.

The Emergency Alert System is devised to provide the President of the U.S. a means to communicate with the public in the event of a national emergency. It is designed to follow standard EAS protocol listed in 11.31 of the EAS rules. EAS began to supplement the Emergency Broadcast System on January 1, 1997 and replaced it on January 1, 1998. It is called the Emergency Alert System because it includes communication media other than broadcast.

The system described above is not subject to copyright, though the text of this thoughtful description may be.

(9) Theorem.

Four colors are sufficient to color a map so that no two adjacent regions require using the same color.

The Four Color Theorem is not subject to copyright.

(10) Work in which the copyright has expired.

Today we come to a kind of attraction even more curious than the last, namely, the attraction which we find to be of a double nature -- of a curious and dual nature. And I want first of all to make the nature of this doubleness clear to you. Bodies are sometimes endowed with a wonderful attraction, which is not found in them in their ordinary state. For instance, here is a piece of shellac, having the attraction of gravitation, having the attraction of cohesion; and if I set fire to it, it would have the attraction of chemical affinity to the oxygen in the atmosphere. Now, all these powers we find in it as if they were parts of its substance; but there is another property which I will try and make evident by means of this balloon. There is no attraction between this balloon and this shellac at present: there may be a little wind in the room slightly moving the balloon about, but there is no attraction. But if I rub the shellac with a piece of flannel, look at the attraction which has arisen out of the shellac, simply by this friction, and which I may take away as easily by drawing it gently through my hand.(19)

On the page giving the publication history of one of the later editions of this work there is the statement: Copyright © 1960 by the Viking Press, Inc. All rights reserved. The author of this work died in 1867; this description first appeared in published form in 1860. In all fairness, the Viking Press has brought to us Michael Faraday's six splendid Christmas Lectures under the title, "On the Various Forces of Nature." Still, under U.S. Copyright Law, the work is clearly within the public domain and the Copyright statement gives no hint of that to the reader. Any of us could make an electronic version of that work and send it to the world without the slightest worry of copyright infringement.

(11) Work that the legal owner has placed in the public domain.

Oliver's Professional Web Site

contains a modest offering of databases in the public domain and material which has been placed in the public domain by the author.

An author may place a work in the public domain. We are told by the U.S. Copyright Office that once a work is in the public domain, it may not be copyrighted, though there is a caveat to that advice, namely that every time a substantially new edition is created, especially if it is a new translation or done by a new editor, a new work is created, so you count from the creation of that edition, not from the creation of the original. Even though the bulk of information in a newly copyrighted work may be in the public domain, the copyright owner is under no obligation to state that; in fact the work is copyrighted as a whole, even though the U.S. Supreme Court holds that there are parts within the whole that are in the public domain.(20) On the other hand, a new editor has the right to place the work in the public domain, as one of us has done with the works offered in the link given above.

What might fairly offer examples of collections of data in chemistry lying in the public domain? Here are just a few:

Physical properties of inorganic and organic compounds

NBS Tables of Thermodynamic Properties

Atomic positions within crystal structures, space groups, density

Standard laboratory procedures

Tables of Solubility

Densities of Solutions

Mathematical Tables

Electronic configuration of the elements

Properties of the Isotopes of the Elements

Reduction Potentials

The Periodic Table

Rules of Nomenclature

Gravimetric factors

Tables of the composition and decay of isotopes

Biochemical Reference Data

Physical properties of commercial plastics

Heats of formation, heats of combustion

Acid and Base dissociation constants

THE CHALLENGE

One motivation we have in this paper is to urge our colleagues to get on with the task before us. Offer your imaginative teaching strategies to your students and certainly protect that creative effort through the process of copyright, but when you build databases of chemical knowledge clearly in the public domain, consider offering them for all of us to share and to augment.

One curious recent development in the area of on-line databases is the use of presentations in which the data themselves cannot be manipulated save by means of proprietary print and search engines, such as in the form of PostScript or Adobe Acrobat files. The images of the data can be printed and search services can be offered on those files but any direct manipulation of the data or downloading the files for subsequent manipulation is not allowed. To be universally useful to users of the World Wide Web, a database must offer the capability of manipulation. That is, the user ought to be able to download it, import it into a word processor or spreadsheet program, or write a program to do some specialized manipulation of the data not otherwise offered by an off-the-shelf or Web application.

What to do? (1) Think of a database you feel needs to be offered; (2) Search for its existence already on the Web through available search servers; (3) if it isn't there, announce to appropriate discussion groups (CHEMED-L, for example) that you plan to put together such a database and ask for volunteers to help you; (4) get on with the task; (5) make all of the data downloadable. You can't predict how someone else might want to use the data; it is best to offer them in a form which can be manipulated by the user. (6) Format: anything you want as long as the delimiters are non-overlapping with the data. Place keywords(21) near the top of your database to be picked up by Web search engines; (7) include a "hit" counter near the top of the database.
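The formatting advice in point (6) can be made concrete. The sketch below is one possible way, under our own assumptions, to pick a delimiter that never occurs in the data, write the table as plain text, and read it back unchanged. The function names, the candidate delimiters and the sample rows are illustrative only, and the sketch assumes that no cell contains a line break.

```python
# Sample rows for illustration; note that some cells contain commas.
ROWS = [
    ["Name", "Formula", "Molecular weight", "Crystalline form"],
    ["Actinium", "Ac", "227", "silvery white metal, cubic"],
    ["Actinium bromide", "AcBr3", "466.73", "White, hexagonal"],
]


def choose_delimiter(rows, candidates=(",", ";", "|", "\t")):
    """Return the first candidate that appears in no cell of the data."""
    for delimiter in candidates:
        if not any(delimiter in cell for row in rows for cell in row):
            return delimiter
    raise ValueError("every candidate delimiter occurs in the data")


def to_delimited(rows, delimiter):
    """Write one row per line, cells joined by the delimiter."""
    return "\n".join(delimiter.join(row) for row in rows) + "\n"


def from_delimited(text, delimiter):
    """Inverse of to_delimited for data without embedded line breaks."""
    return [line.split(delimiter) for line in text.splitlines()]


if __name__ == "__main__":
    delimiter = choose_delimiter(ROWS)
    text = to_delimited(ROWS, delimiter)
    print(text, end="")
    print("round trip intact:", from_delimited(text, delimiter) == ROWS)
```

A file written this way can be imported into a spreadsheet or read by a short program of the user's own, which is the kind of manipulation argued for above.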

Regarding the last point, the counter offers an indication of this new ease of information transfer. As professional educators we now find it possible to share where sharing was virtually impossible before. The same ambience of the research lab where everyone is motivated by an objective of common interest can be experienced via the new avenue of communication. The counter connected to every database gives the person or group who prepared the presentation a measure of the common interest in that information. Such common interest can be tapped via collaboration at a distance for further development of the collection of data.

To the criticism that making such data freely available will diminish credibility and high standards we would offer that over the long run such a new approach will be self correcting, as are the for-profit systems in place now. The more tables of data are used the higher is the probability that someone will spot an error and report it. If public domain databases can be downloaded and made available on numerous servers or used privately then it stands to reason that people will WANT to organize reporting groups to disseminate information on discovered errors.

THE ELEVENTH CATEGORY

Consider the following copyright statement:

Copyright ©1998 by Oliver Seely. This work may be copied without limit if its use is to be for non-profit educational purposes. Such copies may be by any method, present or future. The author requests only that this statement accompany all such copies. All rights to publication for profit are retained by the author.

What is important about this statement from the point of view of the educator?

(1) It states up front that the author will not claim infringement for educational use of any kind in unlimited quantities. All uncertainty about its fair-use status for educational use is removed. As regards its use as a classroom resource, it is as good as if it were in the public domain. (2) It is written in the spirit of the ownership of intellectual property: "I own rights to this creative work and I want to share it freely with my colleagues in education." Moreover, it is written in the spirit of the constitutional objective "to promote the progress of science and useful arts,. . .". Finally, it is written in the spirit of the 1976 U.S. Copyright Act which tells us that the Act's "primary purpose is to promote the creation and dissemination of knowledge and ideas." (3) It contains the seeds of some fundamental change in international attitudes about copyright because on the World Wide Web a single person who writes thoughtfully and lucidly within a subject can make his/her creativity freely available to all. (4) The breadth and style of that creative effort can go far beyond (video/audio links, interactive exercises) that which can be offered within a single package of other more traditional media. (5) The author has complete artistic freedom. Many textbooks today contain far more chapters than can possibly be covered in the allotted time, partly because many people get into the act: editors, marketers, advertisers and reviewers. An author of a Web textbook can decide what he/she wants to offer and go for it. (6) Putting material on the Web for our students is not much different from what most of us do in our classrooms. Making things available to everyone on the Web requires but a minor change in procedure. (7) Many authors of even successful textbooks will tell you that when all accounts were settled they earned very low hourly wages for the work they expended on the project.
For a forfeit of that low hourly wage one can offer on the Web a more intellectually satisfying product to the world and be free from telephone calls from an editor asking for major or minor changes or reminding one of an impending deadline.

CONCLUSION

That today a single new work becomes available to everyone with a connection to the Internet at minimal added cost in time and effort to the original author means that the major future effort in producing new works can be the creative process itself. Whereas in the case of more traditional media a process involving typesetting, printing, marketing and advertising was a necessary partner in giving creative works some limited exposure, today, using the World Wide Web, those other tasks are largely eliminated and the exposure is limited only by the number of people connected to the Web.

1. Copyright Act. 17 U.S.C. 101 to 810 (1976 & Cum. Supp. 1984).

2. http://lcweb.loc.gov/copyright/fls/fl10

3. http://www.cetus.org/fairindex.html

4. Fair Use Guidelines, Copyright Information Services, P.O. Box 1460, Friday Harbor, WA 98250

5. Reproduction of copyrighted works by educators and librarians, Circular 21, Copyright Office, Library of Congress, [1992], Washington, D.C.

6. Helm, Virginia M., What Educators Should Know about Copyright. The Phi Delta Kappa Educational Foundation, Bloomington, Indiana, 1986.

7. http://www.promo.net/pg/history.html

8. Rules of thumb for determining when a work enters the public domain in the United States

1. Works first published before January 1, 1978 usually enter the public domain 75 years from the date copyright was first secured, which is usually 75 years from the date of first publication. (This is the rule Project Gutenberg uses most often)

2. Works first created on or after January 1, 1978 enter the public domain 50 years after the death of the author if the author is a natural person. (Nothing will enter the public domain under this rule until at least January 1, 2023.)

3. Works first created on or after January 1, 1978 which are created by a corporate author enter the public domain 75 years after publication or 100 years after creation whichever occurs first. (Nothing will enter the public domain under this rule until at least January 1, 2053.)

4. Works created before January 1, 1978 but not published before that date are copyrighted under rules 2 and 3 above, except that in no case will the copyright on a work not published prior to January 1, 1978 expire before December 31, 2002. (This rule copyrights a lot of manuscripts that we would otherwise think of as public domain because of their age.)

5. If a substantial number of copies were printed and distributed in the U.S. without a copyright notice prior to March 1, 1989, the work is in the public domain in the U.S.

Caveat: Every time a substantially new edition is created, especially if it is a new translation or done by a new editor, a new work is created, so you count from the creation of that edition, not from the creation of the original.
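Rules 1 through 4 above lend themselves to a small calculation. The sketch below encodes them exactly as quoted in this note, so it inherits whatever the rules of thumb leave out. It does not model rule 5 or the caveat about new editions, it works in whole years only, and it reflects the law as described here rather than any later revision. The function and parameter names are our own.

```python
def copyright_expiry_year(*, created, published=None,
                          author_death=None, corporate=False):
    """Estimate the year a copyright term ends under rules 1-4 as quoted."""
    # Rule 1: first published before 1978 -> 75 years from publication.
    if published is not None and published < 1978:
        return published + 75

    if corporate:
        # Rule 3: 75 years after publication or 100 years after creation,
        # whichever occurs first.
        if published is None:
            year = created + 100
        else:
            year = min(published + 75, created + 100)
    else:
        # Rule 2: 50 years after the death of a natural-person author.
        if author_death is None:
            raise ValueError("author_death is required for a natural person")
        year = author_death + 50

    # Rule 4: created before 1978 but not published before then;
    # the term cannot end before December 31, 2002.
    if created < 1978:
        year = max(year, 2002)
    return year


if __name__ == "__main__":
    # A work first published in 1860, such as the Faraday lectures
    # quoted in the text.
    print(copyright_expiry_year(created=1860, published=1860))
```

The result is only as good as the inputs and the rules of thumb themselves; a new translation or a substantially new edition restarts the count, as the caveat warns.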

9. Poplawski, Edward G., Outline, Intellectual Property Liability and Jurisdiction on the Internet, 1997, private communication.

10. Ybarra, Michael J., The Internet Hits a Glitch Called Copyright Law, The Los Angeles Times, May 29, 1997.

11. Chemical Rubber Publishing Company, CRC Press, Inc., 1920-1981

12. The Scofield Reference Bible, Oxford University Press, New York. 1945.

13. Feist Publications, Inc. v. Rural Telephone Service Co., Inc., No. 89-1909, Supreme Court of the United States, Decision March 27, 1991.

14. Helm, Op. cit., p. 9

15. Copyright Act. Op. cit.

16. http://www.geom.umn.edu/events/courses/1996/cmwh/Copyright/vpublic.html

17. Chemical Rubber Publishing Company, CRC Press, Inc., 1920-1981

18. Feist Publications, Inc. Op. cit. 103(c): "Rural's white pages do not meet the constitutional or statutory requirement for copyright protection. While Rural has a valid copyright in the directory as a whole because it contains some forward text and some original material in the yellow pages, there is nothing original in Rural's white pages. The raw data are uncopyrightable facts, and the way in which Rural selected, coordinated, and arranged those facts is not original in any way. Rural's selection of listings -- subscribers' names, towns, and telephone numbers -- could not be more obvious and lacks the modicum of creativity necessary to transform mere selection into copyrightable expression. . . .Moreover, there is nothing remotely creative about arranging names alphabetically in a white pages directory. It is an age-old practice, firmly rooted in tradition and so commonplace that it has come to be expected as a matter of course."

19. Six lectures on the various forces of matter. Bence-Jones, Henry. "The life and letters of Faraday." 2nd ed. London, Longmans, Green, 1870.

20. Feist Publications, Inc. v. Rural Telephone Service Co. Op. Cit. (c) "While Rural has a valid copyright in the directory as a whole because it contains some forward text and some original material in the yellow pages, there is nothing original in Rural's white pages. The raw data are uncopyrightable facts, and the way in which Rural selected, coordinated, and arranged those facts is not original in any way."

21. For Chemical Synonyms one might choose as keywords "chemical synonyms," "synonyms of chemicals," "chemical names", "names of common substances," "names of chemicals," "chemical formulas."