News and Views from a Mass Spectrometry Lab in Irapuato, Mexico

γνῶθι σεαυτόν (know thyself)

Citation reports and lab presentations are the daily bread of a scientist. To make life a little less painful, I wrote an R markdown script that pulls my bibliographic data from the ORCID and Crossref databases and creates a word cloud and two tabular reports. The script runs in RStudio. For my ORCID identifier ('0000-0001-6732-1958'), it produces the word cloud shown in the graphic of this blog article and exports the image to a PNG file. The citation summary, based on Crossref data, gives realistic numbers (a bit lower than Google Scholar, and similar to ISI Web of Knowledge analyses):

                 publications  sum_citations  H_factor
Winkler, Robert            82           1439        21
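
For readers who wonder how the three numbers in such a summary relate to each other, here is a minimal Python sketch. It is not the R markdown script described above; the function names and the example citation counts are made up for illustration. It assumes that you already have one citation count per publication (e.g. from Crossref).

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def summarize(citations):
    """Build the three columns of the summary table from per-paper counts."""
    return {
        "publications": len(citations),
        "sum_citations": sum(citations),
        "H_factor": h_index(citations),
    }


# Invented citation counts for six papers (not real data).
example_counts = [25, 8, 5, 3, 3, 0]
print(summarize(example_counts))
```

The H factor depends only on the ranking of the papers by citations, so a few highly cited papers raise the citation sum much more than the H factor.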

Co-authors are sorted by the number of co-authored papers:

  orcid_ident orcid_coauthor_names coauthor_freq
1 0000-0002-0367-337X Hertweck, Christian 12
2 0000-0001-8228-7663 Moreno-Pedraza, Abigail 8
3 0000-0002-4908-8934 Martínez-Jarquín, Sandra 7
4 0000-0001-8041-8586 Gamboa Becerra, Roberto 5
5 0000-0002-3463-957X García-Lara, Silverio 5
6 0000-0002-7471-9152 Montero-Vargas, Josaphat Miguel 5
7 0000-0002-9493-6402 Kniemeyer, Olaf 5
8 0000-0001-7738-7184 Ordaz-Ortiz, Jose Juan 4
9 0000-0002-6738-9554 López-Castillo, Laura Margarita 4
10 0000-0002-2097-0464 De Vizcaya-Ruiz, Andrea 3
11 0000-0003-0973-9728 Escamilla-Rivera, Vicente 3
12 0000-0001-6230-8092 Délano-Frier, John Paul 2
13 0000-0001-7827-486X Barraza, Aarón 2
14 0000-0001-8037-2856 Partida-Martinez, Laila P. 2
15 0000-0001-8080-5797 Garcia-Cuellar, Claudia Maria 2
16 0000-0002-3600-050X Alvarez-Venegas, Raúl 2
17 0000-0002-4589-6870 Herrera-Estrella, Alfredo 2
18 0000-0002-6028-0425 Hube, Bernhard 2
19 0000-0002-7397-2092 Vergara, Fredd 2
20 0000-0002-8982-5070 Jimenez-Sandoval, Pedro 2
21 0000-0003-1689-7759 Ramirez-Chavez, Enrique 2
22 0000-0003-2371-0307 Molina Torres, Jorge 2
23 0000-0003-4363-7274 de Folter, Stefan 2
24 0000-0001-6203-0342 Ortiz-Martinez, Margarita 1
25 0000-0001-6668-217X Tefsen, Boris 1
26 0000-0001-7096-2713 Garza-Rodríguez, Maria Lourdes 1
27 0000-0001-7700-039X Richter, Ingrid 1
28 0000-0001-8001-6843 Itouga, Misao 1
29 0000-0001-8032-9890 Treutler, Hendrik 1
30 0000-0001-8561-7170 Pedruzzi, Ivo 1
31 0000-0001-9224-0455 Rodríguez-López, Carlos 1
32 0000-0001-9489-2545 Villegas-Sepulveda, Nicolas 1
33 0000-0001-9503-9634 Heinekamp, Thorsten 1
34 0000-0001-9522-0062 Marsch Martinez, Nayelli 1
35 0000-0001-9671-0784 Guntinas-Lichius, Orlando 1
36 0000-0001-9702-9480 Heim, Joel Benjamin 1
37 0000-0002-0309-604X Capella-Gutierrez, Salvador 1
38 0000-0002-0368-6007 Urrea-López, Rafael 1
39 0000-0002-0584-2093 Telle, Sabine 1
40 0000-0002-1056-4665 Gepts, Paul 1
41 0000-0002-1977-0115 Rodríguez Sixtos Higuera, Alicia 1
42 0000-0002-2826-2518 Mayolo-Deloisa, Karla 1
43 0000-0002-2943-7754 Valdés-Santiago, Laura 1
44 0000-0002-3789-1826 Vizuet-de-Rueda, juan carlos 1
45 0000-0002-4187-2863 Feuermann, Marc 1
46 0000-0002-4241-2674 Vlasova, Anna 1
47 0000-0002-4459-9727 Braun, Hans-Peter 1
48 0000-0002-4509-7964 Abud-Archila, Miguel 1
49 0000-0002-5321-1763 Rito-Palomares, Marco 1
50 0000-0002-5408-4022 Herrera-Ubaldo, Humberto 1
51 0000-0002-6392-170X Azuara-Liceaga, Elisa 1
52 0000-0002-7472-9844 Trevino, Victor 1
53 0000-0002-7727-6967 Rosas Román, Ignacio Raúl 1
54 0000-0002-8062-6999 von Eggeling, Ferdinand 1
55 0000-0002-8132-8651 Brunkhorst, Frank Martin 1
56 0000-0002-8442-514X González-González, Mirna 1
57 0000-0002-9061-1061 Glöckner, Gernot 1
58 0000-0002-9455-0796 Krewinkel, Albert 1
59 0000-0002-9751-6702 Shelest, Ekaterina 1
60 0000-0003-1361-5162 Tamez-Pena, Jose 1
61 0000-0003-1679-3247 Doyle, Sean 1
62 0000-0003-1939-9181 Diaz Flores, Maria Fernanda 1
63 0000-0003-1948-3391 Werner, Ernst R. 1
64 0000-0003-2469-3238 Ayora-Talavera, Teresa del Rosario 1
65 0000-0003-3156-5779 Lozano Garcia, Omar 1
66 0000-0003-3186-0292 Kavanagh, Kevin 1
67 0000-0003-3229-1378 White, Theodore 1
68 0000-0003-3888-0931 Weiskirchen, Ralf 1
69 0000-0003-4163-7334 Imhof, Diana 1
70 0000-0003-4193-2720 Delaye, Luis 1
71 0000-0003-4322-0145 Santos, Herbert 1
72 0000-0003-4494-1446 Mejía-Giraldo, Juan Camilo 1
73 0000-0003-4839-3117 Deufel, Thomas 1

These tables are saved in CSV format and can easily be imported into spreadsheet programs (e.g. LibreOffice or Microsoft Excel) for producing charts. The R markdown script can be downloaded from GitHub.
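
The counting behind a co-author table like the one above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the R markdown script itself: the function names, the input structure (one author list per paper) and the placeholder ORCID identifiers and names are all hypothetical. The sort order, by frequency and then by identifier, follows what the table above shows.

```python
import csv
import io
from collections import Counter


def coauthor_table(papers, own_orcid):
    """Count in how many papers each co-author appears.

    `papers` is a list of author lists; each author is an (orcid, name) tuple.
    """
    counts = Counter()
    names = {}
    for authors in papers:
        seen = set()  # count every co-author only once per paper
        for orcid, name in authors:
            if orcid == own_orcid or orcid in seen:
                continue
            seen.add(orcid)
            counts[orcid] += 1
            names[orcid] = name
    rows = [(orcid, names[orcid], freq) for orcid, freq in counts.items()]
    # Most frequent co-authors first; ties are ordered by identifier.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


def to_csv(rows):
    """Serialize the table as CSV text with the column names used above."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["orcid_ident", "orcid_coauthor_names", "coauthor_freq"])
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder identifiers and names (not real ORCID records).
ME = "0000-0000-0000-0001"
papers = [
    [(ME, "Me"), ("0000-0000-0000-0003", "Author B")],
    [(ME, "Me"), ("0000-0000-0000-0002", "Author A"),
     ("0000-0000-0000-0003", "Author B")],
]
print(to_csv(coauthor_table(papers, ME)))
```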

As you may notice, the supervisor of my PhD and postdoc time still occupies first place among my co-authors, which demonstrates that my time in the Hertweck lab was really productive. The following ranks are held by very successful M.Sc. and PhD students who worked in my lab, as well as by my best internal and external collaborators. Listing the publications with internal co-authors demonstrates the vital role of my lab for the institution. Besides, this list helps me to spot productive collaborators for funding proposals and future research projects. You should always bet on the fastest horse and the winning team!

The word cloud and bibliographic analyses are based on your overall output, so they may reflect how other researchers perceive your work. You can also modify the R script to search by time intervals, for example to compare the most frequent article title words during your PhD time with those after getting an independent position as a group leader.
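
The idea of comparing intervals can be sketched as follows. Again, this is a hypothetical Python illustration and not a modification of the R script: the function name, the short stop-word list, and the years and titles are invented, and the sketch assumes that you have a list of (year, title) pairs for your publications.

```python
import re
from collections import Counter

# A deliberately short stop-word list; extend it for real use.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def title_words(records, first_year, last_year):
    """Count title words of all publications within a year interval (inclusive)."""
    counts = Counter()
    for year, title in records:
        if first_year <= year <= last_year:
            # Runs of letters only; digits and punctuation separate words.
            words = re.findall(r"[^\W\d_]+", title.lower())
            counts.update(word for word in words if word not in STOP_WORDS)
    return counts


# Invented publication years and titles (not real bibliographic data).
records = [
    (2005, "Biosynthesis of a fungal toxin"),
    (2006, "A fungal polyketide synthase"),
    (2015, "Mass spectrometry imaging of plant tissue"),
    (2018, "Open software for mass spectrometry data"),
]

early = title_words(records, 2004, 2007)
late = title_words(records, 2012, 2019)
print(early.most_common(3))
print(late.most_common(3))
```

The two counters can then be fed into any word cloud tool, one cloud per career stage.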

Observing the development of your research focus during your academic life can be an extraordinary experience; give it a try!


Colemak: The ideal keyboard layout for multilingual scientists

As scientists, we frequently need special symbols such as ±, ‰, µ. Many researchers also have to write in multiple languages. As a Bavarian/German who works in Mexico and publishes principally in English, I switch continuously between typing in German, Spanish and English, meaning that I need letters on the keyboard that are typical for those languages, such as ñ, é, ü, ß etc. As you might know, dealing with a considerable variety of symbols on a standard keyboard layout is tricky. In contrast, the Colemak keyboard layout provides shortcuts for commonly used letters in the Latin writing system.
But the real reason why I changed to the Colemak keyboard layout was a different one: at a very lively conference dinner of the Mexican Society of Biochemistry, I cut a tendon of my left index finger. As a consequence, I required two surgeries and more than 100 sessions of physiotherapy. Nevertheless, a complete recovery of the functionality was not possible. Prolonged typing was painful, and therefore I looked for solutions. The first steps were capable dictation software (Swype and Dragon) and a mechanical keyboard (Corsair Strafe with blue Cherry switches). Dictating e-mails and typing on a high-quality keyboard was already a significant improvement toward ergonomic writing. But still, I had to move my stiff finger for every 't', which motivated me to try alternative keyboard layouts. Finally, I discovered the Colemak layout, which is designed to optimize finger movements and to support multilingual writing. Compared to other ergonomic arrangements such as Dvorak, fewer keys differ from the standard layout, which makes it easier to learn. The Colemak webpage provides training strategies and software recommendations.
Sometimes you want to cheat and see the keys (e.g. for typing your complicated password)? In this case, you can either define a key for switching keyboard layouts (I am using the [Win] key to change between Colemak and US layout) or re-label the keyboard with stickers. Thanks a lot to Roman Glinnik, who created high-quality labels for Colemak!
Of course, switching to a new keyboard layout reduces your typing speed for several days or weeks. But in the long term, you write faster and reduce your risk of repetitive strain injury (RSI).

Use artificial intelligence!

Search engines, shopping platforms and social media networks make great use of algorithms for identifying patterns in massive data sets, and help us to find relevant information, products and friends. In stark contrast, many scientists 'do not trust' artificial intelligence. Of course, hypothesis-driven research, according to Popper's Scientific Method (Popper, Karl R. 1959. The Logic of Scientific Discovery. Oxford, UK: Basic Books.), is still the gold standard. But how should we deal with data from exploratory projects such as genomics, proteomics, metabolomics, etc.?
Data Mining methods combine general statistics with machine learning and help us to detect important variables and associations. Further, we can build predictive models for classification or quantification. For mass spectrometry data sets, we found the Random Forest algorithm particularly useful since it performs well with noisy data and relatively few samples. Thus, next time you analyse a few thousand variables from a dozen samples, you should try Rattle, free Data Mining software available from https://rattle.togaware.com/. This R package implements various algorithms such as Decision Tree, Random Forest, AdaBoost, Support Vector Machine and Neural Networks. The Graphical User Interface of Rattle is human-friendly and also suitable for beginners (BTW: Graham Williams, the author of Rattle, works at the Australian Taxation Office).
Soon (~March 2020) we will publish our RSC book "Processing Metabolomics and Proteomics Data with Open Software: A Practical Guide" (http://pubs.rsc.org/bookshop/collections/series?issn=2045-7545). The co-authors Miguel Reboiro-Jato, Daniel Glez-Peña and Hugo López-Fernández contributed a chapter about "Statistics, Data Mining and Modeling", demonstrating with code examples various advanced strategies for data processing, such as self-organising maps (artificial neural networks), biomarker discovery and predictive machine learning models.
Thus, enter the next level of Omics data analysis and use artificial intelligence!

Academic writing: docx, LaTeX or markdown?

Today I wrote my productivity report for 2019. Of course, I was using the official .doc (!) format of our institution. I also wrote four letters with my .docx and .odt templates. Word (https://products.office.com/word), LibreOffice (https://www.libreoffice.org/) and similar "What You See Is What You Get" (WYSIWYG) word processors are ideal for quickly creating simple documents.
However, for scientific manuscripts or dissertations, you can use more advanced programs. The reference for academic typesetting is LaTeX (https://www.latex-project.org/). Writing a text is somewhat similar to programming, and the produced PDF files are ready to publish. However, little errors, such as a forgotten '}', may cause hours of bug hunting and drive away beginners. Online services such as Overleaf (https://www.overleaf.com/) greatly facilitate the use of LaTeX and collaborative working. LaTeX is the first choice for people who aim for correctly set equations and beautiful results.
In recent years, writing texts in markdown has become increasingly popular. The plain text files contain only a few special commands that define the structure, such as '#' for a section heading, or the formatting, such as '**' for bold text. Thus, the syntax is easy to learn, and the files can be opened with any text editor. There are dedicated markdown "What You See Is What You Mean" (WYSIWYM) editors such as Ghostwriter (https://wereturtle.github.io/ghostwriter/), but working with more nerdy editors such as Atom (https://atom.io/) or Vim (https://www.vim.org/) is also possible. Several programs already support exporting the markdown text to docx, odt, html or pdf. Pandoc (https://pandoc.org/) enables the use of custom templates, e.g. the LaTeX or docx style templates of journals. Markdown files are very light-weight, cross-platform compatible and mobile-friendly.
If you are interested and want more information, see:
Krewinkel A, Winkler R. 2017. Formatting Open Science: agilely creating multiple document formats for academic manuscripts with Pandoc Scholar. PeerJ Computer Science 3:e112, https://doi.org/10.7717/peerj-cs.112

(Bio)chemistry is not sexy

This is quite clear to me. Although I try to convince students, colleagues and executives that molecules are the basis of life, since they provide energy (ATP), store information (DNA), enable movement (actin), kill enemies (chloramphenicol), control libido (testosterone; see fig.), etc., they seem to be boring, and knowledge about (bio)chemistry irrelevant. Yes, teaching can be quite frustrating...

Currently, a lot of resources are spent on sequencing genomes of microbes, animals and plants. But is this useful if we do not study their metabolic activity at the same time? I have not seen many success stories of eradicating diseases or improving crops from genomics. We will have to fill the genomes with life.

Be prepared:
The era of (bio)chemistry is just coming!

I am a Scientist; why should I blog?

I got a position as a principal investigator, and my 1-page contract is renewed automatically as long as I work and avoid stupid mistakes.
My lab is publishing, and my graduate students usually quickly find good jobs.
My research funding is limited (did I mention that my lab is located in Irapuato, Mexico?), but the salaries of all lab members are safe.

So, why should I bother myself with writing a Blog?

A Blog will neither augment my (auto)citation number, nor my income. In contrast, it will cost me time and money.

Well, there are several motivating reasons, such as:

1) (Re)connecting science and society.
Although we receive public funding, most of our results are (in the best case) only available to other researchers. "Paywalls" of scientific publishers and complicated terminology prevent the transfer of knowledge to the general public. Thus, essential findings, such as the causes of climate change and the distribution of microplastics in the biosphere, are not used efficiently for preventing environmental catastrophes. Possibly beneficial inventions for medicine and agriculture, such as genome editing, could be ignored or rejected by society if there is insufficient information for a constructive discussion.

2) Presenting alternative technologies.
Do-it-yourself gadgets and open-source software (OSS) packages can often replace commercial devices and software platforms. Free solutions might even offer superior features. Since non-commercial technologies usually have no marketing budget, they are less known. I want to improve this situation a little. Thus, if you know about any exciting innovation that could fit into the topics of the Blog, please drop me a note.

3) Exchanging personal experiences and views.
Every person and career is different. However, I will share some conclusions from my first decade as an independent researcher, e.g. on career planning, research field definition, productive writing, funding, collaborations, and work-life balance. You might even discover that a scientist's life is not for you...

Naturally, any information posted on this Blog is biased by my personal opinion and philosophy. Nevertheless, if you like my Blog, you can follow it; comment, suggest or contribute articles.

Robert Winkler
