Wednesday, April 26, 2017

Exercise: A Sequence Signature for Transcription-Translation Coupling in Bacteria?

A pretty common question over on Quora is something along the lines of "how do I learn bioinformatics".  Great question!  Tonight I'm going to outline a project which I think would make a good first bioinformatics project.  It is rich in content and keys off an interesting new non-computational result.  And since I've left graffiti on multiple Quora threads that I would write something like this in the immediate future, here it is!

Saturday, April 22, 2017

Pinniped Karyotypes & N50 Statistics

In my recent piece on long read assembly, I laid out part of the case against the N50 statistic.  Historically, the issues with the statistic have centered on the fact that it can be gamed at the expense of assembly correctness or assembly coverage. These are concerns for the typical sort of short read assemblies we've grown used to: lots of contigs and the temptation (perhaps justified) to try to go for higher N50s by more aggressive merging or by filtering out the short contigs.  Elin Videvall over at The Molecular Ecologist has a nice ongoing series of posts illustrating the statistic and these commonplace issues.
I'm going to come at the problem from the other end, as a new preprint from 10x Genomics illustrates the problem of using an N50 statistic (or any related Nxx statistic) with good long-read / linked read assemblies -- but doesn't demonstrate this point quite as strongly as I thought when I first started drafting this.
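For readers who haven't computed one by hand, here is a minimal Python sketch of an Nxx statistic and of the "filter out the short contigs" gaming described above.  The function name `nxx` and the toy contig lengths are my own illustrative assumptions, not taken from any assembler or from the preprint; the definition used is the common one (the length of the contig at which the running total, going from longest to shortest, first reaches the chosen fraction of the total assembly length).

```python
def nxx(lengths, fraction=0.5):
    """Return the Nxx statistic for a list of contig lengths.

    Hypothetical helper for illustration: contigs are sorted longest
    first, and the result is the length of the contig at which the
    running total first reaches `fraction` of the summed length.
    fraction=0.5 gives N50, fraction=0.9 gives N90, and so on.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length


# Toy assembly (made-up numbers): two decent contigs plus many tiny ones.
contigs = [80, 60] + [10] * 16          # total length 300

# Same assembly after discarding every contig shorter than 20.
filtered = [c for c in contigs if c >= 20]

print(nxx(contigs))    # 10
print(nxx(filtered))   # 80
```

Nothing about the underlying assembly improved between the two calls; the second one simply covers less sequence (140 versus 300), yet its N50 is eight times higher.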

Thursday, April 20, 2017

Time to Retire HeLa?

A TV movie dramatizing Rebecca Skloot's The Immortal Life of Henrietta Lacks, produced by and starring American culture mogul Oprah Winfrey, is about to hit screens.  If you haven't read this remarkable book, you really should.  It should certainly be required reading for anyone entering biomedical fields.  That's not to claim it is perfect; one of Lacks' sons has objected to the way his family is portrayed.  But it is a searing human story of how the most famous cell line in the world came to be.  Even if you excuse some of the injustices done as compatible with then-contemporary ethical standards, it is a thought-provoking piece on the topic of what our biomedical ethics should be.

Thursday, April 13, 2017

Alexandria Jumps Into Shuttle Business

A restaurant I frequented during my grad school days had a map on the wall showing Boston area transit routes from roughly the 1940s.  Remarkably, most of those streetcar routes are found largely unchanged in the MBTA's current bus routes.  Yes, routes have been altered to account for expansion of the Red Line and shifting of the Orange Line, but most of the routes are little changed and very, very few new ones have been added.  Some of that reflects the canalization of routes by the street patterns; there are only so many large streets suitable for buses and Somerville's hills and the various rivers impose further constraints.  Much of it lies in the always tight purses at the T and the political difficulty of ever closing an old route to enable moving resources to a new one.  Unfortunately, the commuting patterns in Boston are not conserved from the 1940s, with far more workers commuting from distant suburbs and dense developments springing up.

Monday, April 10, 2017

10x Launches Mass T-Cell Receptor Decoding

Adaptive immunity is an endlessly fascinating topic which I have not explored very deeply, which is particularly unfortunate given the many parallels to computing.  Combinatorial logic is used to construct a vast array of possible antigen readers, expression logic ensures that only one such reader is expressed in a given cell and hypermutation and evolution are used to optimize these readers to match specific antigens.  All this not only creates weapons to deploy against foreign invaders, but also a memory which effectively records an individual's history of environmental exposures.  Just before I started writing this, two tweets highlighted using adaptive immunity profiling to reveal exposure to tuberculosis and cytomegalovirus.  Adaptive immunity is responsible for transplant rejection, with new companies looking to more selectively modulate immunity to enable transplants without shutting the immune system down.  Adaptive immunity also ties into the white hot field of immunotherapy for oncology, exploring whether differences in antigen response underlie variation in immunotherapy success.  To enable profiling adaptive immunity on a mass scale, 10x Genomics has now introduced a single-cell kit for targeted profiling of T-cell receptor variable regions.

Tuesday, April 04, 2017

SageHLS: Automated uHMW DNA Preparation

Advances in optical mapping, linked reads, PacBio and nanopore sequencing are making it possible to generate highly contiguous large genome sequences routinely and inexpensively.  However, this in turn is creating intense demand for efficiently and reliably preparing ultra-high molecular weight (uHMW) DNA.  By this term, I mean DNA approaching or exceeding a megabase in size.  Methods for preparing HMW and uHMW DNA tend to be very old-school, reaching back at least to the 1970s, 80s and 90s for approaches used in the early days.  Phenol-chloroform preps with the DNA spooled out onto a glass hook or rod are one popular approach; another is to embed cells in agarose blocks, extract the DNA within the block and then degrade the agarose to retrieve the DNA.  Nuclei preps are yet another approach. Any liquid handling must be performed gently and with wide bore pipettes.  These techniques tend to be tedious and slow affairs, requiring many manual steps.  As an alternative, Sage Science has launched an instrument which automates a process with no hazardous chemicals, the SageHLS.

Thursday, March 30, 2017

Chromosome-Scale Scaffolds And The State of Genome Assembly

A new paper on using Hi-C sequencing appeared in Science recently, demonstrating the generation of chromosome-length scaffolds for human as well as several insect genomes.  The authors even provide a cost model, proposing that by processing multiple genomes in parallel the sequencing reagent cost (but not labor) of this approach should be about $10K per human genome. In the case of the insect genomes, the paper enables a look at chromosome evolution which is simply impossible with lower resolution.  These findings resonate with a number of pieces I've written over the years, but particularly with my recent criticism of the proposed Earth BioGenome Project and a spirited defense of that concept made in the comments of my piece by a member of the steering committee.

Monday, March 27, 2017

Differential Mammalian Toxicity: Why Do Some Human Foods Kill Dogs?

I've been contemplating this post for a while, but it can be seen as another angle on my recent post on the challenges of drug discovery, so it finally left the mental queue.  We often use other mammalian species in drug development to predict human toxicity.  We know animals aren't the same as people, but lacking a better alternative that's what we do.  Now, as regular readers know I keep company with a dog, and that sometimes has me wondering: how well do we understand the cases of things we can eat but which are dangerous for our canines?

Saturday, March 25, 2017

Targets: Drugability Revisited

My correspondent @datarade shot a tweet my way on his quest to understand drug discovery. He does this despite the fact I've promised posts on previous tweets that are submerged in my mental queue.  But the best part of teaching is forcing yourself to rethink what you think you know, so I'm going to actually take this one on in the space of "what is a target, how do we pick them and how do we drug them".  Which I've found to be enlightening and frustrating.  It's a messy space because so much is empirical, and I keep devising and then discarding taxonomies and explanatory approaches because they all seem unsatisfactory.

Tuesday, March 21, 2017

Obviousness: Rarely Obvious

Pacific Biosciences has made new thrusts in their ongoing intellectual property action against Oxford Nanopore, adding two recently issued patents to the fray.  Oxford has publicly brushed these off as "another pore excuse for a lawsuit", but certainly the battle is not over.  One of these patents, 9,542,527 "Compositions and methods for nucleic acid sequencing", appears to concern using hairpin linkages to read both strands, much like the 9,404,146 "Compositions and methods for nucleic acid sequencing" patent that PacBio led with.  Since Oxford has announced they will abandon their "2D" methods that use such hairpins, this angle would seem to be soon irrelevant (as I predicted back when PacBio originally attacked).  But the other, US 9,546,400 "Nanopore sequencing using n-mers", covers basecalling methods, which is a new twist.  A route to challenge any patent is to identify "prior art", information which was publicly available at the time of the patent filing which impinges on the claims in the patent application.  Not only can exact matches to prior art be an issue, but also anything which would be "obvious" to a skilled practitioner.  And that can certainly be a can of worms.

Monday, March 20, 2017

plexWell: Illumina Libraries by the Plateload

The advent of so-called next generation sequencers, particularly those from Illumina, has brought the price of sequence data down dramatically.  However, there is a catch: the cost of preparing DNA to go into the sequencer, the process known as library preparation, has glided downwards on a much shallower trajectory.  This means that for projects wishing to sequence very large numbers of small genomes or large constructs the cost of library preparation can be similar to or even exceed the cost of data generation.  A small company north of Boston called seqWell Inc has a new approach to Illumina library generation which they are on the cusp of making widely available, and not only does this bring the cost per well down but it is designed to yield normalized libraries from relatively unnormalized samples.

Tuesday, March 14, 2017

ONT Updates: GridION X5, PromethION, 1D^2, Scrappie, FPGAs and More

Clive Brown gave a webcast today with updates on a number of Oxford Nanopore topics, but clearly the flagship announcement was a new instrument, GridION X5.  Due to the raging snowstorm in the Boston area I was home with my teammate and we've been doggedly going through the tweets (now storified) and my notes (plus David Eccles' nice set) to retrieve the juiciest bones therein.

Wednesday, March 08, 2017

MinION Leviathan Reads: An Update

Last week I posted a piece on some amazing new nanopore data, only to be red-faced to discover the next morning that I had misread the axes.  So I re-posted the piece with the offending data and subsequent analysis in strike-thru font.  After I did that, I was informed that the same dataset actually did have leviathan reads, bigger than my misinterpretation.

Thursday, March 02, 2017

Catching Up On Oxford Nanopore News: More, Better, Meth & Huge

Oxford Nanopore and its collaborators have shown at least three interesting advances in the last few months which I haven't yet covered, the most astounding of which was announced this week.  I'll take these three in an order which works logically for me, though it isn't strictly chronological, plus I'll touch on some parts of their platform which have not made the advances that were perhaps expected.

(Morning after: Ugh, ugh, ugh -- I misread an axis, inserting an extra 0 -- so major crossouts in one section; why I shouldn't post late at night during pauses in day job stuff)

Tuesday, February 28, 2017

Earth BioGenome Project: Ill-Conceived Megaproject Du Jour

There's been a bit of buzz recently about an unfunded proposal to ultimately sequence every living species on Earth, warming up by sequencing every eukaryotic species, with a targeted cost of $4.8B.  It pains me a bit to write this, but I'm with those who think this is not a wise way to spend money and certainly not likely to work for anywhere near that budget.

Friday, February 17, 2017

#AGBT17 Tweet Archive is Up!

I've used my scheme for collecting and organizing tweets to capture most of the feed from this week's AGBT17 conference.  I still need to pore over these in detail, so I won't try to distill out many thoughts (other than that single-cell sequencing is clearly in exponential growth phase!).

Monday, February 13, 2017

Bagging Novel Enzymes Via Mass Spec Metabolomics

Obtaining a complete genome sequence for a bacterium or archaeon is essentially a solved problem, if you can culture the bug.  Grow up biomass, purify the DNA and then use PacBio alone or a combination of long reads (PacBio or Oxford Nanopore) and short reads.  These should yield a closed genome with a very low error rate.  A few bugs spit at you by repeatedly failing PacBio sequencing or having some monster prophage or other repeat that is longer than the read lengths, but these are very rare.  With advances in metagenomics techniques, the solving of uncultured genomes is becoming increasingly easy and many of these remarks also apply to fungi and other eukaryotic microorganisms. Once you have the sequence, then the lack of introns in bacteria and archaea makes gene prediction almost trivial, and you now have a parts list for the organism.  But is that a useful parts list?  A new paper in Nature Methods makes some progress in improving the utility of those parts lists, though we are still far from actually fully understanding an organism given its genome.

Thursday, February 02, 2017

Could Hermione Tackle MinION Yield Variability?

A bit of a foray into Oxford Nanopore land again.  By replacing a bench bumbler with someone competent, we've seen some success with our MinION at Starbase.  Highly variable yields though.  I've done some looking and discovered this isn't a unique experience.  And now Oxford is suggesting that software upgrades alone will give MinION about another 50% boost in yield; it will be interesting to see what this does for variability.  Finally, I have a notion of some of the sources of variability and an idea for a troubleshooting tool.

Wednesday, February 01, 2017

Illumina Drops NeoPrep

At the 2015 AGBT meeting, Illumina launched the NeoPrep, a ~$40K instrument to automate the preparation of up to 16 sequencing libraries at a time, using a technology called electrowetting microfluidics. Now news comes that Illumina is dropping the NeoPrep, halting sales immediately and allowing existing users about a year of reagents.  What happened and how does it impact genomics?

Tuesday, January 31, 2017

On The International Nature of American Biotech

I'll spend two hours in project meetings tomorrow. Around the table will be a group of scientists who are all at the top of the game and among the best in the world at what they do. We will be trying to push forward new antibiotics to save lives. Yes, we are also trying to be rewarded monetarily with it, but we all share a mission to improve humanity by finding new drugs for important medical needs.

Friday, January 27, 2017

Perl: The Bad Habit I Can't Quite Kick

TULIP is a new assembler for long, error-rich reads such as those from nanopore. I was a bit stunned to see that TULIP is written in Perl; I was starting to wonder how many holdouts like me there were. Which led to this exchange on Twitter.

Tuesday, January 24, 2017

Notes on a Conversation with 10X

I've been remiss in writing up a piece on 10X Genomics based on a phone discussion last week with Michael Schnall-Levin (VP Computational Biology and Applications) and Anup Parikh (Director, Product Marketing).  I always appreciate companies reaching out to me and spending time to educate me on their products and plans, and this was a very interesting and enjoyable conversation.

Saturday, January 21, 2017

Gen9 Vanishes

Earlier this week one of my colleagues got a somewhat ominous email from the CEO of Gen9 titled "Special Gen9 Announcement", which led off by saying that their holiday shutdown would be followed by a "corporate restructuring period" during which "Gen9 will not be accepting orders". The next day came an article from Scott Kirsner detailing the effective shutdown of Gen9 and sale of its assets to Ginkgo Bioworks for an undisclosed amount of cash and stock.  Interestingly, Kirsner reports that only 10 Gen9 employees will make the transition and that most of the Gen9 staff was laid off in mid-December.  It is surprising that no gossip of the cutbacks seemed to enter my radar, given a number of personal connections to the company (CEO Kevin Munnelly was a colleague at Millennium; several members of the Gen9 business group were ex-Codon or ex-Infinity and we had done limited business with Gen9).

Tuesday, January 17, 2017

Bio-Rad Sips Up RainDance

Monday evening brought news that Bio-Rad has further consolidated its grip on the droplet microfluidics space by acquiring RainDance Technologies for an undisclosed price.  Bio-Rad had previously acquired droplet digital PCR company QuantaLife back in October of 2011 and targeted sequencing company GnuBio in April of 2014.  While the droplet digital PCR has been marketed for many years now, the GnuBio effort had gone relatively quiet since the acquisition.  However, Bio-Rad announced at the J.P. Morgan Conference that this technology will be launched as OncoDrop late this year.

Monday, January 09, 2017

Illumina Unveils HiSeq Successor NovaSeq

At today's J.P. Morgan Healthcare Conference Illumina made a number of small announcements -- some new partnerships, Firefly on track for launch later this year, launch of the single cell workflow partnered with Bio-Rad.  Then CEO Francis deSouza dropped the big news: a new high-end sequencer architecture to ultimately replace all of the HiSeq instruments.  It sounds like an interesting evolution of the Illumina product line, but unfortunately too many headlines and tweets have focused on a distant goal of $100 human genomes.  Worse, not only did some commentators misconstrue the announcement as delivering on $100 genomes, but some also touted a sequencing speed of one hour for a genome which isn't remotely true.

Sunday, January 08, 2017

Pondering What Is Lost In Teaching Translation

I'm good at acquiring distractions, and a relatively new one is Quora.  This site allows users to ask questions which are then answered by members of the community.  I lurk in a number of fields, but have answered a few questions related to genomics and related fields of biology.  Tackling a question last night required re-learning some details I was disappointed I had forgotten.  In researching to regain that knowledge, I skimmed a number of study guides online, which leads to this post.

Saturday, January 07, 2017

#JPM17 Genomics and Synthetic Biology Companies

With the 2017 J.P. Morgan Conference in Healthcare (#JPM17) starting Monday, I and others have engaged in early reporting or speculation.  I've tried to compile a list of presenting companies in the genomics, informatics and synthetic biology tool spaces, but these were filtered quickly from a long list of presenting companies so I may have missed some -- please leave comments and I can add them.  Also, some of the big conglomerates could speak on these topics but might ignore them, so no promises.  For example, Roche has their pharmaceutical CEO speaking, so we may not hear anything about the PacBio breakup or Genia lawsuit.  All times are Pacific Standard Time and are from J.P. Morgan, though I've converted to 24-hour time (hopefully successfully!).  You may need to register with J.P. Morgan to follow the links I've provided and access the webcasts when they are available.

Thursday, January 05, 2017

Two Pore Guys Previews Handheld Nanopore Analyte Sensor Ahead of J.P. Morgan Conference

2017 is certainly shaping up to be a big year for nanopore news.  I touched on Oxford Nanopore's very full plate in my speculation about sequencing platforms and we already know of two different legal actions which will be progressing, PacBio vs. Oxford Nanopore and University of California vs. Genia.  James Hadfield's take on possible Illumina announcements at the J.P. Morgan Conference includes an Illumina nanopore device.  That's speculation; today we had a pair of tweets from Two Pore Guys previewing their sensing device and that they will be talking more at J.P. Morgan (all videos from 2PG).

Tuesday, January 03, 2017

University of California Cries "Thief!" on Genia Patents

As I noted in my last post, the University of California has filed suit against Genia claiming that Genia co-founder Roger Chen misappropriated intellectual property from UC Santa Cruz and the laboratory of Mark Akeson (filings include a bunch of other well-known nanopore scientists, including David Deamer and Dan Branton).  While the filings are mostly dry, they are enlivened occasionally by such colorful language as "evasive tactics", "aided and abetted" and "stonewalled".  Goaded by Mick Watson, I've dug into the court filings and some of the patents (and obtaining those filings apparently cost me some real money, perhaps approaching $1.0e01 dollars).

Monday, January 02, 2017

Sequencing Technology Outlook, January 2017

Another year of blogging is upon us!  The J.P. Morgan Conference starts a week from today, and then before long it's time for AGBT.  So if one is going to prognosticate, then there's no time to lose, as announcements could start flying at any time.