Electronic laboratory notebooks in a public-private-partnership

Abstract

This report shares our experience of the selection, implementation and maintenance phases of an electronic laboratory notebook (ELN) in a public-private partnership project and comments on user feedback. In particular, we address which time constraints exist for the roll-out of an ELN in granted projects and which benefits and/or restrictions come with out-of-the-box solutions. We discuss several options for the implementation of support functions and the potential advantages of open access solutions. Connected to that, we identified willingness to share data and a vivid culture of data sharing as the major factor leading to success or failure of collaborative research activities. User feedback turned out to be the only driver of technical improvements, but it also proved highly efficient. Based on these experiences, we describe best practices for future projects on the implementation and support of an ELN serving a diverse, multidisciplinary user group based in academia, NGOs, and/or for-profit corporations located in multiple time zones.


Introduction
Laboratory notebooks (LNs) are vital documents of laboratory work in all fields of experimental research. The LN is used to document experimental plans, procedures, results and considerations based on these outcomes. Proper documentation establishes the precedence of results, in particular for inventions of intellectual property (IP). The LN provides the main evidence in the event of disputes relating to scientific publications or patent applications. A well-established routine for documentation discourages data falsification by ensuring the integrity of the entries in terms of time, authorship, and content (Myers, 2014). LNs must be complete, clear, unambiguous and secure. A remarkable example is Alexander Fleming's documentation, which led to the discovery of penicillin (Bennett & Chung, 2001).

The recent development of many novel technologies has brought up new platforms in the life sciences that require specialized knowledge. For example, next-generation sequencing and protein structure determination generate datasets that are becoming increasingly prevalent, especially in the molecular life sciences (Du & Kofman, 2007). Combining and interpreting these data requires experts from different research areas (Ioannidis et al., 2014), leading to large research consortia.

In consortia involving multidisciplinary research, the classical paper-based LN is an impediment to efficient data sharing and information exchange. Most of the data from these large-scale collaborative research efforts will never exist in hard copy, but will be generated in digitized form. These data can be analysed by specialized software and dedicated hardware. The classical LN fails in these environments. It is commonly replaced by digital reporting procedures, which can be standardized (Handbook: Quality practices in basic biomedical research, 2006) (Bos et al., 2007) (Schnell, 2015). Besides the advantages for daily operational activities, an electronic laboratory notebook (ELN) yields long-term benefits regarding data maintenance. These include, but are not limited to, the items listed in Table 1 (Nussbeck et al., 2014). The order in which the items are listed does not imply any ranking. Besides general tasks, some specific tasks have to be facilitated, especially in the field of drug discovery. One of these is functionality for searching chemical structures and substructures in a virtual library of chemical structures and compounds (see Table 1, last item in column "Potentially"). In an ELN hosting reports about wet-lab work with known drugs and/or compounds to be evaluated, such a function would allow dedicated information retrieval for the chemical compounds or (sub-)structures of interest.

Interestingly, although these advantages are essential for the success of research activities in collaborative settings, users rarely realize them during daily documentation activities, and institutional awareness in academic environments is often lacking.
For the first step, we had to manage a large and highly heterogeneous user group (Figure 3) that would be using the ELN, which was scheduled for roll-out within 6 months after project launch. All personnel of the academic partners were requested to enter data into the same ELN, potentially leaving individual user requirements unmet, especially for novices and inexperienced users.
As a compromise for step 1, we assembled a collection of user requirements (URS) based on the experiences of one laboratory that had already implemented an ELN. We further selected a small group of super users based on their expertise in documentation processes, representing different wet laboratories and in silico environments. The resulting URS was reviewed by IT and business experts from academic as well as private organisations of the consortium. The final version of the URS is available as a supplement (Article S1).
In parallel, based on literature (Rubacha, Rattan & Hossel, 2011) and Internet searches, presentations of widely used ELNs were evaluated to gain insight into state-of-the-art ELNs. This revealed a wide variety of functional and graphical user interface (GUI) implementations differing in complexity and costs. The presentations covered the continuum between simple out-of-the-box solutions and highly sophisticated, configurable ELNs with interfaces to state-of-the-art analytical tools. Notably, the requirements specified by super users also ranged from "easy to use" to "highly individually configurable". Based on this information, it was clear that the ELN selected for this consortium would never ideally fit all user expectations. Furthermore, the exact number of users and the configuration of user groups were unknown at the onset of the project.

The most frequently mentioned or most highly prioritized items of the collected user requirements are listed in Table 2. We divided the gathered requirements into 'core', meaning essential, and 'non-core', standing for 'nice to have, but not indispensable'. Further, we list here only the items that were mentioned by more than two super users from different groups. The full list of URS is available as a supplement (Article S1).
Table 2: Overview of user requirements organized as 'core user requirements' for essential items, and 'non-core user requirements' representing desirable features.

Core user requirements
• System set-up and implementation should be fast and simple
• Access from different platforms should be possible: Windows, Linux, Mac OS
• Low training requirements (for high level of acceptance)
• Hosted system with state-of-the-art security settings
• Simple user management (only limited support by project members possible)
• Suitable for both chemical (including e.g. drawings of molecules) and biological (including e.g. capture of fluorescent images) experiments
• Low costs, especially for long-term usage in the academic area
• Conform with Good Laboratory Practice
• User management with dedicated access permissions (expectation: all users working on the same project, but in different work packages)

Non-core user requirements
• Workflow management
• Order management
• Chemical structure handling
• Dedicated tree structure for storing experiments
• Legally binding procedures (signatures)
• Modular expandability
• Appropriate integrated analytical features
• Social networking and collaborative (chat) features
• Storage for large sets of "raw" data for reanalysis

Based on the URS, a tender process (Step 2, Figure 2) was initiated in which vendors were invited to respond via a Request for Proposal (RFP) process. The requirement that the proposed ELN support both chemical and biological research, combined with the need to access the ELN from different operating systems (Windows, Linux, Mac OS) (see Figure 2), reduced the number of appropriate vendors. Their responses described how their offerings aligned with the proposed specifications.
Key highlights and drawbacks of the proposed solutions were collected, as well as approximations of the number of required licenses and maintenance costs. The cost estimates for licenses were not comparable because some systems required individual licenses whereas others used bulk licenses. At the time of selection, the exact number of users was not available.

Interestingly, the number of user specifications available out of the box differed by less than 10% between the systems with the lowest (67) and highest (73) number of matching features. Thus, highlights and drawbacks became more prominent in the selection process.
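The "less than 10%" figure can be checked from the two counts reported above. The following is a minimal sketch; the function name is ours and purely illustrative, and because the text does not say which count serves as the baseline, both are shown.

```python
def relative_difference(low: int, high: int, baseline: int) -> float:
    """Return the gap between two feature counts as a fraction of a baseline."""
    if baseline <= 0 or low > high:
        raise ValueError("expected low <= high and a positive baseline")
    return (high - low) / baseline


lowest, highest = 67, 73  # out-of-the-box feature counts reported in the text

gap_vs_highest = relative_difference(lowest, highest, baseline=highest)
gap_vs_lowest = relative_difference(lowest, highest, baseline=lowest)

print(f"{gap_vs_highest:.1%} of the highest count")
print(f"{gap_vs_lowest:.1%} of the lowest count")
```

Either baseline keeps the gap below 10%, so the statement holds regardless of how the percentage was computed.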
For the third step (Figure 2), the two vendors meeting all the core user requirements and the highest number of non-core user requirements were selected to provide a more detailed online demonstration of their ELN solution to a representative group of users from different project partners. In addition, a quote for 50 academic and 10 commercial licenses was requested. The direct comparison did not result in a clear winner: both systems included features that were immediately required, but each lacked some essential functionalities.
For the final decision, only features that differed between the two systems and supported the PPP were ranked by importance for the project, from 1 = low (e.g. academic user licenses are cheap) to 5 = high (e.g. cloud hosting). The decision between the two tested systems was then based on the higher number of positively ranked features, which emerged as most important after the presentations and internal discussions among super users. The main drivers for the final decision are listed in Table 3. In total, the chosen system received 36 positive votes on the listed features and met all the highly ranked demands listed in Table 3, while the runner-up received 24 positive votes. However, setting up the runner-up in the envisaged consortium turned out to be too expensive and too complex to maintain.
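The vote count behind this decision can be sketched as a simple tally. The example below is illustrative only: the system names, super users, features and votes are hypothetical, and the sketch covers just the counting of positive votes, not the 1 to 5 importance ranking or the discussions that preceded it.

```python
from collections import Counter

# Hypothetical votes: (super_user, system, feature). Only features that differ
# between the two systems are voted on; all names below are illustrative.
positive_votes = [
    ("user_1", "System A", "cloud hosting"),
    ("user_2", "System A", "cloud hosting"),
    ("user_1", "System A", "web interface"),
    ("user_2", "System B", "workflow engine"),
    ("user_3", "System B", "workflow engine"),
]


def tally_votes(votes):
    """Count positive votes per system."""
    return Counter(system for _user, system, _feature in votes)


def select_system(votes):
    """Return (winner, tally); raise if there are no votes or the top two tie."""
    tally = tally_votes(votes)
    ranked = tally.most_common()
    if not ranked:
        raise ValueError("no votes to evaluate")
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        raise ValueError("tie: further discussion needed")
    return ranked[0][0], tally


winner, tally = select_system(positive_votes)
print(winner, dict(tally))
```

Raising an error on a tie, instead of picking a system arbitrarily, reflects that the comparison described above did not yield a clear winner until the differing features had been discussed.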

Table 3: List of drivers for the final decision about which ELN solution to set up and run in the project consortium.
Main drivers for the final decision
• Many end users were unfamiliar with the use of ELNs, therefore the selected solution should be intuitive
• Easy to install and maintain
• Minimal user training required
• Basic functionalities available out-of-the-box (import of text, spreadsheet, pdf, images and drawing of chemical structures), with as few configurations as possible
• ELN does not apply highly sophisticated checking procedures, which would require a high level of configuration, restricting users to apply their preferred data format (users should take responsibility for the correct data and the correct format of the data stored in the ELN instead)
• Web interface available to support all operating systems to avoid deploying and managing multiple site instances
• Vendor track record: experience of the vendor with a hosted solution as an international provider
• Sustainability:
  ◦ Affordable for academic partners also after the five-year funding duration of the project (based on the maturity of the vendor, number of installations/users, the state-of-the-art user interface and finally also the costs)
  ◦ Easy to maintain (minimal administrative tasks, mainly user management)
  ◦ Support of different, independent user groups
  ◦ Configurable private and public templates for experiments
• Proposed installation timeframe
• Per user per year costs for academics and commercial users

The complete process, from the initial collection of URS data until the final selection of the preferred solution, took less than 5 months.
Following selection, the product was tested before specific training was offered to the user community (Iyer & Kudrle, 2012). Parallel support frameworks were rolled out at this time, including a help desk as a single point of contact (SPoC) for end users.
The fourth step (Figure 2) was the implementation of the selected platform. This deployment was simple and straightforward because it was available as Software as a Service (SaaS) hosted as a cloud solution. Less than one week after signing the contract, the administrative account for the software was created and the online training of key administrators commenced. The duration of training was typically less than 2 hours, including tasks such as user and project administration.
To accomplish step 5 (Figure 2), internal training material was produced based on the experience gained during the initial introduction of the ELN to the administrative group. This ensured that all users would receive applicable training. During this initial learning period, the system was also tested for the requested user functionalities. Workarounds were defined for missing features detected during the testing period and integrated into the training material.

• Define type of data which should be placed in the ELN (e.g. raw data, curated data)

We did not define specific data formats since we could not predict all the different types of data sets that would be used during the lifetime of the ELN. Instead, we gave some best practice advice on arranging data (Table S3 and Table S4) to facilitate its reuse by other researchers. The initially predefined templates, however, were only rarely adopted, although some groups use nearly the same structure to document their experiments. More support, especially for creating templates, may help users to document their results more easily.
For the final phase, the go-live, high user acceptance was the major objective. A detailed plan was created to support users during their daily work with the ELN. This comprised the setup of a support team (the project-specific ELN-Helpdesk) as a single point of contact (SPoC) and detailed project-specific online training, including documentation for self-training. As part of the governance process, we created working instructions describing all necessary administrative processes provided by the ELN-Helpdesk.
Parallel to the implementation of the ELN-Helpdesk, a quarterly electronic newsletter was rolled out. It was used to remind potential users that the ELN is an obligatory central repository for the project. The newsletter also provided a forum for users to access information and news, with messaging to remind them of the value of this collaborative project.
Documents containing training slide sets, frequently asked questions (FAQs) and best practice spreadsheet templates (

Technical solution

The selected ELN, hosted as a SaaS solution in a cloud-based service centre, provided a stable environment with acceptable performance, e.g. login < 15 sec, opening an experiment with 5 pages < 20 sec (for further technical details please see Supplementary file S1). During the evaluation period of two years, two major issues emerged. The first was a denial of access to the ELN for more than three hours due to an external server problem, which was quickly and professionally solved after contacting technical support; the other was related to the introduction of a new user interface (see below).

The administration was simple and straightforward, comprising mainly minor configurations at the project start and user management during the runtime. One issue was a gap in communication regarding the number of active users, which caused a steady increase in the number of licences.

A particular disadvantage of the selected SaaS solution concerned system upgrades. There was little notice of upcoming changes, and user warnings were hidden in a weekly mailing. To keep users updated, the vendor sent weekly or biweekly mails about the ELN to the user community. Although users read these messages initially, interest diminished over time. Consequently, users were confused when they accessed the system after an upgrade and the functionality or appearance of the ELN had changed. On the other hand, system upgrades were performed over weekends to minimize system downtime.
The costs per user were reasonable, especially for the academic partners, for whom the long-term availability of the system, even after project completion, could be assured. This seemed to be an effect of the competitive market, which caused a substantial drop in price in recent years.
User experience

In total, more than 100 users were registered during the first two years of runtime, whereas the maximum number of parallel user accounts was 87, i.e.

Overall, the regular documentation of experiments in the ELN appeared to be unappealing to researchers. This infrequent usage prompted us to carry out a survey of user acceptance in May/June 2015 (a detailed description of the analysis methods, including the KNIME workflow and raw data, is provided in Article S2). The primary aim was to evaluate user experiences compared to expectations in more detail. In addition, the administrative team wanted feedback to determine how the existing support structure could be changed to increase usage or simplify the routine documentation of laboratory work. Overall, 77 users (see Table 4) were invited to participate in the survey. Two users left the project during the runtime of the survey. We received feedback from 60 (= 80%) of the remaining 75 users. Two questionnaires were rejected because fewer than 20% of the questions had been answered, leaving 58 questionnaires for evaluation; see Supplementary file S2.
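The participation figures above follow from a few subtractions and one ratio. The sketch below reproduces them from the reported counts; the function and key names are ours and purely illustrative.

```python
def survey_summary(invited, left_project, returned, rejected):
    """Derive the response figures for a survey from raw counts."""
    eligible = invited - left_project   # users still in the project
    evaluated = returned - rejected     # questionnaires kept for analysis
    return {
        "eligible": eligible,
        "evaluated": evaluated,
        "response_rate": returned / eligible,
        "evaluated_rate": evaluated / eligible,
    }


# Counts as reported in the text.
summary = survey_summary(invited=77, left_project=2, returned=60, rejected=2)
print(f"{summary['response_rate']:.0%} returned, "
      f"{summary['evaluated_rate']:.0%} evaluated")
```

With the reported counts, the response rate is 80% of the 75 remaining users, and the 58 evaluated questionnaires correspond to about 77% of them.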
The survey also has some limitations that should be discussed. The number of invited active ELN users was low (n=77); we therefore refrained from collecting detailed demographic data in order to ensure full anonymity of the participants, expecting in return higher participation and, especially, more detailed answers to the free-text questions. In addition, some interesting questions could not be answered from the questionnaire due to the low number of returned forms (n=58). For example, only six users had some prior experience with ELNs. Three of these six users found that the ELN changed their personal documentation practice positively, while the others did not answer the question or gave a neutral answer. We therefore did not report these results, as they are not representative.

It should also be mentioned that this survey reflects the situation of this specific PPP project. The results cannot be easily transferred to other projects. It would be of interest whether the same survey would give different or the same results (i) in other projects or (ii) over the time course of this project.

A summary of the most important results of the survey is presented in Table 5 below.
Table 5: Overview of the most important results from the survey, which was sent out to 77 users from 18 academic and SME organisations. A set of 58 comprehensively answered questionnaires was considered for evaluation.

Major results from survey
• Most users had never used an ELN before (51 out of 58 users replying to the survey stated "I never used an ELN before this project"; no information from the remaining 19 invited users)
• Most users (76%) are using a paper notebook in addition to the ELN
• Many users (n=23) would not recommend using an ELN again
• Number of users by operating system: Linux=7, Mac OS=14, Windows=37
• ELN typically used
  ◦ Rarely or sometimes, with < 1 h per session (53%)
  ◦ Frequently, with < 1 h per session (16%)
  ◦ Sometimes or frequently, with 1-2 h per session (9%)
• Frequent users (n=13) realized an increase in quality of documentation (46%), and 38% would recommend this software to colleagues, while three of the 13 frequent users would not use an ELN again if they could decide
• Rare users (n=19) are skeptical about ELN functionality (42%)
• While 52% of the Mac and Linux users are satisfied with the performance, 35% of the Windows users are unhappy, compared to 27% who are satisfied with the performance of the system
• Helpdesk support in general is O.K. (36%), but some users (10%) seem to be not satisfied, especially with training (47%)
• The most frequent user demands are higher speed (n=15) and/or a better user interface (n=14)

Despite the perceived advantages of an ELN compared to traditional paper-based LNs as described above, users encountered several drawbacks while using this ELN. Users criticized the provision of templates and cloned experiments, which were considered to impede the accurate documentation of procedural deviations. The standardized documentation of experimental procedures made it difficult to detect deviations or variations because they are not highlighted. Careful review of the complete documentation was required in order to check for missing or falsified information.
For many users (44 out of 58 = 76%), a paper-based LN was still the primary documentation system. They established a habit of copying the documentation into the ELN only after the completion of successful experiments rather than using the ELN online in real time. This extra work is also a major source of dissatisfaction and could create difficult situations in case of discrepancies between the paper and the electronic version when intellectual property needs to be demonstrated. In these cases, failed experiments were not documented in the ELN, although comprehensive documentation is available offline. For failed experiments, users did not consider the effort of documenting the information in digital form worthwhile. In other cases, use of the ELN for documenting experimental work was hindered by performance issues caused by technically outdated lab equipment.

Table 6 and Table 7 below. For a more detailed analysis, see Supplementary file S2.

Summarized results based on the answers by OS:
• Windows users mainly conduct wet lab work while Linux and Mac users perform in silico work
• Windows users find the software too slow and too labor-intensive
• Windows users know the functionality of the ELN better, as they are using the system more frequently
• Mac OS and Linux users are more comfortable with the speed of the software, but they would not use or recommend it again. This may be related to the specific in silico work, which might not be sufficiently supported by the ELN

Summarized results based on frequency of usage:
• In silico users enter data into the ELN less frequently, which is not unexpected as computational experiments generally run for a longer period of time than wet lab experiments
• More frequent users operate the ELN online during their lab work
• Frequent users would like to have higher performance (this might be related to Windows)
• Better quality documentation was associated with more frequent use
• Frequent users are not disrupted by documenting their work in the ELN; they like the software and would use an ELN in future
• Frequent users of an ELN see a positive effect on the way documentation is prepared
• More frequent users like the software and feel comfortable using it, while infrequent users find the ELN complex and are frustrated about its functionality
• Infrequent users are disappointed about the quality of search results

Conclusions from the survey

About 40% of the users did not find the selected solution appropriate for their specific requirements. Either the solution did not support specific data sets or experiment types, or it did not respond fast enough to be used adequately. This indicates that the solution was not fit for purpose. More individual user demands would have to be considered to improve the outcome. This would require more resources in time and manpower than can be accommodated in a publicly funded project. Time would be a key factor, as experimental work begins within 6 months after project kick-off and the documentation of experiments starts in parallel with the experiments. Keeping in mind that every user needs training time to become acquainted with a new system, and that there are always initial 'pitfalls' with any newly introduced system, an electronic laboratory notebook must be available within 4-5 months after project start. About one month should be allocated for the vendor negotiation process. Another month or two are required for writing and launching the tender process. This reduces the time frame for a systematic user requirement evaluation process to less than 1 month after the kick-off meeting. It should be acknowledged that not all types of experiments will be fully defined, nor will all users be identified. Thus the selection process will always be based on assumptions, as described above.
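The time budget described above can be written out as a small calculation. This is a sketch under the stated figures only (ELN available after 4-5 months, about one month of vendor negotiation, one to two months for the tender); the function name is ours, and phases are assumed to run strictly one after another.

```python
def months_left_for_requirements(eln_deadline, negotiation, tender):
    """Months remaining for requirement gathering once fixed phases are subtracted.

    Assumes the phases do not overlap; all values are in months.
    """
    remaining = eln_deadline - negotiation - tender
    if remaining < 0:
        raise ValueError("fixed phases exceed the deadline")
    return remaining


# Tightest case: earliest deadline, longest tender.
tightest = months_left_for_requirements(eln_deadline=4, negotiation=1, tender=2)
# Most generous case: latest deadline, shortest tender.
loosest = months_left_for_requirements(eln_deadline=5, negotiation=1, tender=1)

print(f"requirement phase: {tightest} to {loosest} month(s)")
```

In the tightest combination only about one month remains, and any delay in the other phases pushes the requirement evaluation below that, which is the situation the estimate above describes.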
The slow response of the selected system might have had many potential causes. It could be related to the bandwidth available at the location, but we believe it is more frequently related to the available hardware. We tested the ELN on modern hardware with low and high bandwidth (Windows: Core i7 CPU @ 2.6 GHz, 6 GB RAM tested on ADSL: 25 kbit/s download, 5 kbit/s upload and iMac Core i5 CPU @ 3.2 GHz, 4 MB RAM tested on ADSL: 2 kbit/s download and 400 bit/s upload) without a major impact on performance, but we did not test old hardware. During this project we learned that in certain labs computers run Windows XP, MS Office 2003 and Internet Explorer 8. Outdated software and hardware can contribute to the slow response of the ELN. Another potential issue could arise from uploading huge data sets over slow Asymmetric Digital Subscriber Lines (ADSL). Users who work on local file servers and download data from the Internet face unexpectedly low performance when uploading data to a web resource via ADSL, owing to the low upload bandwidth and high latency. This is true for all centralized server infrastructures accessed over Internet lines, including SaaS, and should be taken into account when considering hosting in the cloud.

Finally, users demand functionalities similar to those of their daily working platform. This is an unsolved challenge due to the heterogeneity of software used in the life sciences, ranging from interactive graphical user interface (GUI) based office packages to highly sophisticated batch processing systems. New ELNs should evolve to provide capabilities more closely aligned with users' requirements.
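The effect of asymmetric lines on uploads, discussed above, can be illustrated with a back-of-the-envelope estimate. The sketch below is not a measurement: the file size and line rates are hypothetical values chosen for illustration, and latency and protocol overhead are ignored, so real transfers would take longer.

```python
def transfer_seconds(size_bytes: float, bandwidth_bit_per_s: float) -> float:
    """Idealized transfer time: payload bits divided by the line rate."""
    if bandwidth_bit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bit_per_s


# Hypothetical example: a 1 GB raw data set on an asymmetric line.
data_set_bytes = 1_000_000_000
download_rate = 16_000_000  # 16 Mbit/s, assumed
upload_rate = 1_000_000     # 1 Mbit/s, assumed

download_time = transfer_seconds(data_set_bytes, download_rate)
upload_time = transfer_seconds(data_set_bytes, upload_rate)

print(f"download: {download_time / 60:.1f} min")
print(f"upload:   {upload_time / 60:.1f} min")
print(f"upload is {upload_time / download_time:.0f}x slower")
```

Under these assumed rates, a data set that downloads in under ten minutes needs more than two hours to upload, which is the kind of mismatch that surprises users accustomed to local file servers.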
For the ongoing PPP project, a more individualized user support capability might have helped to overcome some of the issues mentioned above. Individual on-site training parallel to the experimental work could offer insight into users' issues and provide advice on solutions or workarounds. This activity would require either additional travel for a small group of super users or the creation of a larger, widely distributed group of well-trained super users who are highly informed about ongoing issues and solutions.
The issues discussed above also constitute a social or scientific community problem. Being first, which is often considered the same as being best, is the dictum scientists strive to fulfil, especially when performance is reduced to the number of publications and frequency of citation rather than the quality of documentation and reproducibility, which determine the quality of results. This culture needs to be replaced by "presenting full sets of high quality results including all metadata"

overall process could be faster once the user gets familiar with the system. The risk of losing any necessary information would be lowered by using the system online during the practical work. There are also other advantages in the ELN, e.g. linking experiments to regular protocols.

Scientists often display a strong unwillingness to share their data. They often believe they are the data owner, i.e. the entity that can authorize or deny access to the data. Nevertheless, they are responsible for data accuracy, integrity and completeness as the representative of the data owner. The data generator should be granted primary use (i.e. publication) of the data (DFG, 2013), but the true owner is the organization that financially supports the project.
Within a PPP project it is necessary to establish a documentation policy that is suitable for all. Agreement must be reached on standards for the responsibility, content and mechanisms of documentation, particularly in international collaborations where country-specific and cultural differences need to be addressed (Elliott, What are the benefits of ELN?, 2010). Furthermore, no official, widely accepted standards are pre-defined. As long as ELNs are justified more by control over performance than by fostering willingness to cooperate and share data and knowledge in the early phase of experiments, user acceptance will remain low (Myers, 2014).