Wednesday, July 19, 2017

How to use the reference manager Zotero

(Updated 19 July 2017 regarding Zotero 5.)

Inspired by an email exchange with a colleague, I thought I would write a longer post on how to use the open source reference manager Zotero. Obviously all the information will in some form be available in its documentation, but at least to me the things I really need to look up are often the needle in the haystack of the obvious and the irrelevant. So here is what I think one needs to know to start using Zotero, in what is hopefully a logical order.

All of this is based on Zotero 4, as I have not yet used the newest, version 5. From version 5 the standalone program is required instead of Zotero running through the browser alone. See the relevant comment under installation below.

How it works

Zotero is available for Windows, Mac and Linux, and for LibreOffice or MS Word. It integrates into the browser (I use Firefox, but it also seems to work with Chrome, Safari and Opera), and the usual way to use it is through the browser. This means that the browser has to be open at the same time as the word processor, but there is also a standalone version that I am not familiar with.

Installation

To be honest, this is the only point on which I am a bit confused at the moment because it has been a bit since I installed one of my instances. There are three items that may have to be installed: Zotero itself on the download page, for which there are specific instructions. Then there is the "connector" for the browser you have. Finally, it may be necessary to install a word processor plug-in for your browser. What confuses me is that I seem to remember only installing the latter two last time, so either I misremember or something has changed with the newest Zotero version (?).

Either way, while the installation of Zotero into the browser itself is easy, I have noticed that sometimes the word processor plug-in does not take on the first attempt. In that case I merely repeated the installation and restarted everything, and then it worked.

Update: The colleague who has now started using Zotero has, of course, installed the newest version and kindly adds the following:
The new version of Zotero needs the standalone program installed. This is because Mozilla has dropped the engine that supported a lot of extensions (like Zotero). It was deemed that allowing the browser to carry out low-level functions on the host computer introduced inherent vulnerabilities, and so Firefox versions after 48 have very much restricted what the browser is allowed to do (no longer can it communicate directly with databases, and carry out a lot of file handling functions). The problem is not restricted to Zotero, for example Gnome desktop extensions used this functionality and have also had to change the way they do things. The long and the short is that nowadays you need the standalone application installed.

Using Zotero in the browser and building your reference library

When you have Zotero installed there are two new buttons in the browser: a "Z" that opens your reference library and a symbol right next to it that you can click to import a journal article into that library. Given that the library will at first be empty let's look at that latter function first.

The Zotero buttons in the browser
For example, I may have found a journal article through a Google Scholar search. Ideally, I am now looking at the abstract or HTML fulltext on the journal website, because that page will have all the metadata I want. I now click on the paper symbol to the right of the Z, and Zotero automatically grabs all the fields it can find and saves a new entry into my reference library; if it can get a PDF it will even download that.
Viewing paper abstract
Now I click on the Z button to bring up my library, if I didn't have it open yet. In some cases I may notice that something went wrong. The typical scenarios are that the title of the paper is in Title Case or ALL CAPS. This is easily rectified by right-clicking on the title field and selecting "sentence case".
A reference has been added to the library
If something needs to be edited manually we can do so by left-clicking onto the relevant field. For paper titles, the usual problems would be having to re-capitalise names after correcting for Title Case or adding the HTML tags for italics round organism names. In the present case, however, I find that the title contains HTML codes for single quotation marks instead of the actual quotation mark characters, so I quickly correct that. Manual entry is, of course, also possible for an entire reference, for example if it isn't available online. In that case simply click on the plus in the green circle and select the appropriate publication type.
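The same clean-up steps (decoding HTML codes, converting to sentence case, then re-capitalising names) can be sketched in a few lines of Python. This is only an illustration of the logic, not how Zotero itself does it; the function name to_sentence_case and its keep parameter are made up for this example, and the title is invented.

```python
import html

def to_sentence_case(title, keep=()):
    """Decode HTML codes, convert to sentence case, restore listed names."""
    # Turn codes such as &#39; back into the actual characters.
    text = html.unescape(title).strip()
    # Sentence case: first letter upper case, everything else lower case.
    text = text[:1].upper() + text[1:].lower()
    for name in keep:
        # Re-capitalise the first occurrence of each proper noun or
        # organism name (a sketch; later occurrences are left alone).
        idx = text.lower().find(name.lower())
        if idx != -1:
            text = text[:idx] + name + text[idx + len(name):]
    return text

# Hypothetical title, as it might arrive from a publisher page in ALL CAPS.
raw = "A NEW SPECIES OF CRASPEDIA FROM &#39;ALPINE&#39; AUSTRALIA"
print(to_sentence_case(raw, keep=("Craspedia", "Australia")))
```

Names that need to keep their capitals have to be listed by hand, which mirrors the manual re-capitalising described above.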

So much for importing references. It is also possible to bulk-import from the Google Scholar search results, but I would not recommend that as Google sometimes mixes up the metadata.

The style repository

The first time we try to add a reference to a manuscript, we are asked what reference style should be used. Zotero comes with only a few standard styles installed, but many more are available at the Zotero style repository. One of the few downsides of Zotero, in my eyes, is that it has fewer styles than EndNote, but often it is possible to get the relevant one under a different name. If, for example, you are preparing a manuscript for Phytotaxa, the Zootaxa style should serve just as well.

Installing a new style is as easy as finding it in the style repository, clicking on its name, and confirming that it should be installed.

Selecting a reference style

Using Zotero in the word processor

Again, note that the browser needs to be running while we are adding references to a paper. The following assumes LibreOffice, but except for where to find the buttons everything is the same in MS Word.

In LibreOffice you will have new buttons for inserting and editing references, for inserting the reference list, and for changing document settings, in particular the reference style. To insert a reference, click on the button that seems to read r."Z. You can now enter an author name or even just a word from the title, as in my example here, and Zotero will suggest anything that fits.
Adding a reference to a manuscript
Another downside of Zotero, at least as of version 4 which I am still using, is that it doesn't do a reference like "Bronzati (2017)". Instead you can either have "(Bronzati 2017)" or reduce the reference to "(2017)". For this click on the reference in the field where you were asked to select it (if you have already entered it simply use the edit reference button showing r." and pencil) and select "suppress author". Then you have to type the author name(s) yourself outside of the brackets, which is obviously a bit annoying.

Author names outside of brackets have to be added manually
Once we have added a few references, we obviously need to add the reference list. This is as easy as clicking the third button in the Zotero field. The only others that are usually important are the two arrows (refresh) to update the reference list (although it does so automatically when the document is reloaded) and the cogwheel (document preferences) that allows changing the reference style across the document.

In LibreOffice I have sometimes found that adding or updating references changes the format of the entire paragraph they are embedded in. This seems to happen if the default text style is at variance with the text format actually used in the manuscript. Selecting a piece of manuscript text and setting the default style to fit its format has always rectified the situation for me.

Syncing

It is useful to get an account at the Zotero website and use it to sync one's reference library across computers. Again, this works cross-platform. I do it between a Windows computer at work and my personal Linux computer at home. Note, however, that it only syncs the metadata, not any fulltext PDFs that have been saved.

To sync, go into the browser and click on the Z symbol to open Zotero. Now click the cogwheel and select preferences. The preferences window has a sync tab where you can enter your username and password. Do the same on two computers and they should share their reference libraries.

Sunday, July 16, 2017

Free-association word salad is not the same as analysis

I find it remarkable what kinds of pieces are sometimes published by otherwise serious news organisations. Today during breakfast I made the mistake of trying to read something hilariously filed under "analysis" at the ABC website, Are we sleepwalking to World War III?

It starts with the claim that WW3 is coming and that Australia will be invaded:
All certainty will be lost, our economy will be devastated, our land seized, our system of government upended.
It is backed up by what a single former military commander said to the author over lunch:
This isn't mere idle speculation or the rantings of a doomsday cult, this is the warning from a man who has made it his life's work to prepare for just this scenario.
I may be missing something here, but unless there is a bit more at least circumstantial evidence I would still file this warning as mere speculation; that is kind of what the word means.

Then the author randomly quotes out of context Mark Twain ("History doesn't repeat but it does rhyme"), Alexis de Tocqueville as writing that the French Revolution was inevitable, a claim that can very conveniently be made about any historical event after the fact because it is always untestable, and then quickly moves to a historian's work on the beginning of World War I (while spelling the name of that source in two different ways).

In this latter case at least an actual argument can be discerned: Britain and Germany were trade partners and still went to war, so we should not assume that two countries today would stay at peace just because they are trade partners.

The author accelerates his already breathtaking pace to name-check a Harvard scholar and, before that person gets to say anything useful, the ancient historian Thucydides. He seems to imply that the USA might be forced into starting WW3 to stop the rise of China, as Sparta was forced to start the Peloponnesian War when Athens became too powerful. (I read Thucydides years ago, and I seem to remember it was a bit more complicated than that.)

The text descends into gibberish for a bit:
Any clash between the US and China is potentially catastrophic, but as much as we may try to wish it away, right now military strategists in Beijing and Washington are preparing for just an eventuality.
Perhaps: "just such an eventuality"?
Global think tank the Rand Corporation prepared a report in 2015 for the American military, its title could not have been more direct -- War with China: Thinking Through the Unthinkable.
Yeah, that's the job of strategists and (serious) think tanks.
It concluded that China would suffer greater casualties than the US if war was to break out now. However, it cautioned, that as China's military muscle increased so would the prospect of a prolonged destructive war.
How... what... huh? If I picked a fight with my neighbor now, I could be hurt, BUT (!) if I picked the fight an hour later, the fight could take longer. That doesn't even begin to make sense as a sentence. Even if we try to speculate about what the author may have meant here, for example that China would lose a war now but may have a better chance of winning a few decades in the future, one would have to point out that suffering greater casualties may not be incompatible with winning now either, cf. USSR in WW2. Also, why interrupt the sentence with a comma after the main verb? Did nobody proof-read this?

Having established to his satisfaction that war could happen, the author now moves to the question of what precise incident could precipitate WW3 in Asia. Again a historian is cited, and again only so superficially that it is impossible for the reader to judge if what they say can be backed up. The islands of the South China Sea and other islands disputed between China and Japan are mentioned as the most likely causes of war. Okay, so I am not a military strategist, and I appreciate how useful symbolic conflicts can be to fire up nationalism when a government is in domestic trouble, but are these really the kinds of issues where a government would say, hey, let's needlessly blow up our entire economy and get hundreds of thousands killed over a practically worthless heap of rock? (Or sand, as the case may be.)

But of course we have to move on immediately. Cyber warfare! Thucydides! (Again.) Name-checking a Chinese scholar who does think that China and the USA are too economically interdependent to go to war, so at least we have an isolated counterpoint. Then the former military commander from the beginning opines that it would be helpful if politicians would also consider the risks of going to war; I am sure nobody in the history of humanity has ever had that idea before.

The piece ends with the author claiming to be more optimistic than his interview partner, only to end on a very depressing note. He takes this as an opportunity to quote Shakespeare, I presume in case the mention of Twain, de Tocqueville and Thucydides wasn't enough to signal deep erudition.

Now don't get me wrong, I am also rather pessimistic about the future. Overpopulation, resource limits and climate change may well combine to throw the world into a new dark age, with starvation, mass migrations, widespread collapse of most institutional order, and warlords duking it out with the few Byzantine Empire-like islands of stability that are left.

But that is what I would expect a serious analysis of future trends to look like: citing empirical evidence of risk factors like crop failures, water availability or shifting alliances and how they can produce unsolvable dilemmas for all involved. Merely name-checking historians in a meandering, stream-of-consciousness text without any real information or data isn't it.

Thursday, July 13, 2017

Botany picture #248: Gleichenia dicarpa


This is one of my favourite fern photographs: Gleichenia dicarpa (Gleicheniaceae) forming a large thicket at Jervis Bay, Australia, 2011.

While the group does not occur in Germany I have seen quite a few Gleicheniaceae during field work in South America. They are often aggressive colonisers, especially after major disturbances such as landslides, but are said to be very difficult or impossible to cultivate. Also, while I did my PhD there was another PhD student at the same institute who conducted a taxonomic revision of a genus of Gleicheniaceae in the Neotropics, so all things considered these odd-looking ferns were not new to me when I arrived on this continent.

The specific epithet of Gleichenia dicarpa means "two-fruited". Obviously ferns do not have fruits, but this is a reference to the fact that each little pocket on the lower leaf side contains two, and only two, tiny sporangia when the plant is fertile. Given that ferns often produce clusters (sori) of numerous sporangia this low number is rather peculiar in itself.

Sunday, July 9, 2017

What philosophy is "good for"

There is a very strange discussion popping up from time to time in some of the blogs that I read, where somebody will claim that philosophy is useless because it has not contributed anything to our understanding of the natural world or "to society" in recent times. Although I think that the charge of scientism - empirical science is all we need, every other field of scholarship is useless - is mostly a straw man, it seems that there is a vocal minority of people who really think like that.

For starters, to the degree that this is about philosophy contributing to our understanding of the natural world this is clearly the wrong question to ask. What have bus drivers, as a profession, lately contributed to that endeavour? Nothing; but that does not mean the profession is useless, merely that it has a different job. Conversely, everybody who does contribute to our understanding of nature is by definition a scientist, so the claim that only scientists directly contribute to that understanding is true but trivially so.

The question could then be rephrased more generally as: what do philosophers actually do? What is philosophy good for?

Now I am not a philosopher myself, and the question would perhaps be best answered by a member of that profession. But it so happens that just before I saw that remarkably nihilistic discussion about the value of philosophy I saw a use of philosophy outside of the academic context that provides a very good example of the kind of thing that the field is "good for".

In this post on his website Why Evolution Is True, Jerry Coyne had taken a completely consequentialist stand on the issue of punishment:
If you're a determinist about behavior and a consequentialist about punishment, as I am, then you punish people only if it's for the good of society. (My view is that at the moment of the slaughter, Gutierrez had no "choice" to not kill the birds.)

And there are three social goods to come from punishments like incarceration: deterrence of others, sequestration of someone who could be dangerous to society, and reformation of a criminal so he doesn't repeat his offense when freed. All three of these apply to Gutierrez: jailing him will probably deter others who want to kill wild animals, people who do that tend to be murderous psychopaths who could kill again (maybe people next time) and so need to be put away, but such people may be susceptible to reformation [...].

If none of these reasons obtain, there's no reason to imprison anyone; or can you give me one? But surely deterrence and sequestration apply in most cases--though not capital punishment, which data show isn't a deterrent. And if no social good results from imprisonment, in what sense would Gutierrez still "deserve" to be imprisoned? To satisfy a sense of vengefulness? That, to me, is not a good reason, for it caters to our baser instincts--the same instincts and feelings that make people favor executions. So, if Gutierrez can be reformed, poses a danger to society, or can be a deterrent to others, yes, he "deserves" punishment. But he doesn't deserve it just because he needs to be "paid back" for what he did.
In short, locking somebody up is to be justified (only) by good societal outcomes, while that person "deserving" to be locked up is not a just and reasonable concept (because JC believes that the existence of cause-and-effect is incompatible with personal responsibility). To this the commenter cjwinstead replied as follows:
Suppose we have strong justification to believe that punishing Gutierrez's mother will satisfy the goals of deterrence and reformation; and keeping her hostage would be as effective as sequestration (maybe he really cares a lot about his mother). If we have evidence that this will be more effective in those goals, is there any reason not to punish her? What if she gladly volunteers to receive the punishment on his behalf? I would say that Christian Gutierrez deserves to be the subject of punishment in a way that his mother does not. Proxy punishments do happen in our justice system, and they are arguably effective at deterrence and reformation. Should they be supported if they work?
This, right there, is one of the things that philosophy is "good for". This is not science, obviously, as no empirical data are involved in any but the most remote ways. What cjwinstead has done is propose a thought experiment - a classical method of analytic philosophy - to lay bare our instincts about something (here, that we would consider punishing the mother unjust), to start a conversation about where those instincts come from and what, if anything, they mean to us, and perhaps in particular to demonstrate the absurd consequences of a position (here, basing moral philosophy entirely on consequentialism).

Of course, you may disagree with cjwinstead in this instance. What is more, while his comment sparked a very long discussion, nearly all the people who replied to it missed its point in a way that is somewhere between spectacular and hilarious. But again, this is the kind of thing that is philosophy, and it is useful and necessary to hash out issues that cannot be adjudicated based on empirical studies alone.

We may do science to find out if deterrence and reformation work or not, but science alone cannot necessarily tell us if we should prefer consequentialism to deontology, for example. And even if some scientismist were to argue that it can, using analytic philosophy to point out an internal contradiction or absurdity in an argument still saves us the major investment of conducting a large scientific study to test it.

Saturday, July 8, 2017

Botany picture #247: Androsace villosa


Androsace villosa (Primulaceae), France, 2014. A cute little alpine plant whose leaf rosettes remind me somewhat of Sempervivum. I assume the colour of the throat, which can even in this photo be seen to vary between red and yellow, signals to insects whether the flower is in the right stage to be visited.

Thursday, July 6, 2017

Multi-access keys need a different approach than dichotomous keys

I am close to deploying a reasonably large online multi-access (Lucid) key and find myself fretting about how it will be received by the user community. Obviously people may have different preferences for how exactly a key should look and what features it should or should not have, but one concern I have in particular is that taxonomists used to writing traditional dichotomous keys may be disappointed with some of the choices I made.

To recapitulate, just in case it isn't immediately clear, there are two very common types of identification keys in systematics. The traditional ones are dichotomous and single-entry, because that is what works in books. As an example, consider the Craspedia key in the KeyBase repository (click on bracketed or indented to see the full key). The user has to start at couplet 1 and then answer one pair of leads after the other.

Crucially, to allow all species to be keyed out in such a dichotomous key the author has to find enough characters so that every single species differs in some clear way from at least one other species. There may consequently be lots of characters mentioned in the key, but it doesn't look that way because few of them are mentioned for all species. In the present case, couplet 9 asks if the leaves are sticky-glandular to differentiate Craspedia adenophora, but the trait is irrelevant for all other couplets because only the leaves of that one species are sticky.

The other, increasingly common type of key is multi-access and electronic. As an example, I have just basically at random clicked on the key to the Restionaceae of Western Australia. The user can enter whatever characters they have at hand in whatever order they want, and the key software will kick out all species that don't match. In this case there are also options to narrow the selection down by geography, flowering time or genus (if already known).
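To make the principle concrete, here is a minimal Python sketch of what a multi-access key does behind the scenes. It is not how Lucid or any other key software is actually implemented; the species, characters and states are invented placeholders, and matching_species is a name I made up for the illustration.

```python
def matching_species(key_matrix, observations):
    """Return the species compatible with every observed character state."""
    remaining = []
    for species, states in key_matrix.items():
        compatible = True
        for character, observed in observations.items():
            scored = states.get(character)
            # A character that is not scored for a species never excludes it.
            if scored is not None and observed not in scored:
                compatible = False
                break
        if compatible:
            remaining.append(species)
    return sorted(remaining)

# Invented example matrix: species -> character -> set of possible states.
key_matrix = {
    "sp. A": {"leaf surface": {"sticky"}, "flower colour": {"yellow"}},
    "sp. B": {"leaf surface": {"smooth"}, "flower colour": {"yellow", "white"}},
    "sp. C": {"leaf surface": {"smooth"}, "flower colour": {"white"}},
}

# Characters can be entered in any order and in any combination.
print(matching_species(key_matrix, {"flower colour": "yellow"}))
print(matching_species(key_matrix, {"flower colour": "white", "leaf surface": "smooth"}))
```

Unlike in a dichotomous key there is no fixed starting point: every observation simply removes the species that contradict it.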

While working on my multi-access key (and a previous one before that) I have had conversations with colleagues along the lines of "what about this character, have you considered using that?" for characters that are sometimes very obscure and accordingly hard on the user or, and that is my main point here, serve only to differentiate a single species.

A character like that is often very important when writing a dichotomous key. Imagine the taxonomist working away, shuffling species around like so and so, perhaps ending up with a stubborn pair of species that clearly go together in the key but are hard to differentiate. And then they realise, ah, one of them has woolly hairs on the bracts, and the other doesn't! We have a contrast!

And that is great. But if, for example, the species with the woolly hairs on the bracts is the only one in the entire group with that trait, then the character works only to differentiate that one species. That is not a problem in the dichotomous key because the character is only presented to the user at the moment where it is actually relevant, while they will never see it in any other part of the key.

But in a multi-access key all the characters will be visible right from the start, even the ones that only work to differentiate a single species from the other 99 or so. And if we try to do that for all species we end up with a hundred characters, plus dozens of characters that differentiate 40 from 60 or suchlike. And now imagine the poor user being faced with a table of a bazillion characters - they won't even know where to start, the key will just look terribly daunting.
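A rough way to put a number on this is to ask how many species are still left, on average, after the user has answered a given character. The Python sketch below rests on simplifying assumptions of my own: every species is scored for exactly one state of the character, and every species is equally likely to be the one in the user's hand. The function name expected_remaining is made up for the illustration.

```python
def expected_remaining(state_sizes):
    """Average number of species left after answering one character.

    state_sizes maps each state of the character to the number of
    species showing it (assumption: one state per species).
    """
    total = sum(state_sizes.values())
    # With probability n/total the specimen shows a state shared by n
    # species, and n species then remain in the key.
    return sum(n * n for n in state_sizes.values()) / total

# A character found in only one of 100 species hardly narrows things down ...
print(expected_remaining({"woolly bracts": 1, "no woolly bracts": 99}))
# ... while a 40 versus 60 split removes nearly half the species on average.
print(expected_remaining({"state 1": 40, "state 2": 60}))
```

Under these assumptions the one-species character leaves about 98 of the 100 species on average, the 40 versus 60 character about 52.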

There is a reason why the ideal couplet in a dichotomous key is commonly said to mention perhaps two to three characters; when faced with too much choice or too much information at the same time the human brain just goes into Blue Screen of Death mode. For shopping decisions, for example, there seems to be some evidence that consumers are less likely to make a purchasing decision at all if a shop presents too many options.

What is more, the beauty of electronic multi-access keys is that it is not necessary to differentiate all species from each other. Yes, that is necessary in dichotomous keys printed on paper, but in our newfangled multi-access keys it is all about reducing the number of possible species to a comfortable three to five, and then the user can look at pictures and click on links or species profiles to make the final decision.

Well, I shall see what feedback I will get, but what I want to say here is that the habits that work for one type of key cannot simply be transferred onto a completely different type. The user experience would actually suffer from overloading the interface with dozens of characters each of which will hardly ever be useful.

Sunday, July 2, 2017

Seems as if time-calibration must be working to some degree

This week's journal club discussion covered McIntyre et al. 2017, Global biogeography since Pangaea, Proc. R. Soc. B 284: 20170716.

The authors set out to compare continental break-up (and, to a lesser degree, collision) times as estimated from palaeomagnetic data with species divergence times as estimated from phylogenetic analyses using molecular clocks. They selected 42 vertebrate sister taxa for their presumed lack of dispersibility to exclude groups whose distribution may have been influenced by long-distance dispersal. Even among the selected taxa they tried to account for dispersibility by, if I understand correctly, extending the error bars around the divergence times for lineages that seemed more dispersible.

In the end they arrived at a very nice correlation between continental break-up times and divergence times. What does this tell us?

There were some concerns in our group about the argumentation being somewhat circular. I do not actually see that myself; one dataset was palaeomagnetic, and times in the other would presumably have been based on fossils and nucleotide substitution rates, so really two independent data sources would have been compared. (The time-calibrated phylogenies were sourced from the timetree.org database, which I have not yet used myself.)

To the degree that I found the methodology odd it is because of the decision to extend error bars when dispersal was considered somewhat probable. Yes, admittedly the immediately obvious way of identifying confounding dispersal - comparing divergence times against continental break-up times - would be circular in a study explicitly setting out to compare those two; using that approach would have amounted to massaging the data. But I would still find it more logical to have some way of categorically identifying suspected cases of dispersal and kicking them out of the dataset instead of leaving them in but making the relevant data points fuzzy.

What I found most puzzling, however, is that the paper is not actually very clear on what the research question was. It is thus somewhat up to the reader to draw a conclusion. If you already trust time-calibrated phylogenies, you could take the study to confirm the reliability of palaeomagnetic data. If you already believe that palaeomagnetics works but are somewhat skeptical about time-calibrated phylogenies, this study should at least show that molecular clocks can't be that bad after all, otherwise they wouldn't have got such a neat correlation out at the end.

And this is also what I take away from our reading, especially in the light of the criticism of molecular clocks that is still regularly advanced by vicariance biogeographers and panbiogeographers. Yes, this study did show that the fit is pretty good except where there is reason to suspect dispersal. And that brings us to the last point:

The present paper carefully excluded cases of suspected dispersal to examine only cases of vicariance, so the authors must be biogeographers (and geologists) who accept the existence of both long-distance dispersal and vicariance. And the same was true of our journal club. Nobody I know has any problem whatsoever reading a paper that concludes "this pattern is best explained by vicariance" if that is indeed what the data say.

But let's be clear here, it does not work the other way. Just read the papers I discussed a few weeks ago; pan- and vicariance biogeographers generally do have a problem reading a paper that concludes "this pattern is best explained by long-distance dispersal" and will instinctively start questioning the methodology.

The situation is just not symmetrical. The "dispersalist" who tries to explain every pattern with dispersal, no matter what the data say, is a non-existent straw man. I have never met or read such a colleague. The panbiogeographer who tries to explain every pattern with vicariance, no matter what the data say, does, however, seem to be alive and kicking.