New information emerging in the last week in California paints a very different picture of the spread of the novel coronavirus than the one suggested by the first, official version. Postmortem testing indicates that two Santa Clara County residents who died in their homes in early to mid-February were infected with the novel coronavirus that now has killed more than 1, Californians, the county medical examiner announced Tuesday.

Evidence is building that the origin of the virus as a zoonotic spillover occurred prior to the officially accepted timing of early December. We show that six countries had exceptionally early cases, unlikely to represent part of their main case series. Origination dates are discussed for the first five countries outside China and each continent.

This suggests an earlier and more rapid timeline of spread.

Our study provides new approaches for estimating dates of the arrival of infectious diseases based on small samples that can be applied to many epidemiological situations. Evidence is building that the origin of the virus as a zoonotic spillover occurred before the officially accepted timing of early December. We date the origin of COVID cases from countries and territories using a model from conservation science. We take a method that was originally developed to date the timing of extinction and turn it to dating the timing of origination, using case dates rather than sighting events.

Our results suggest that the virus emerged in China in early October to mid-November, the most likely date being November 17, and by January had spread globally. This suggests a much earlier and more rapid spread than is evident from confirmed cases.

In addition, our study provides a new approach for estimating dates of the arrival of infectious diseases in new areas that can be applied to many different situations in the future.

PLoS Pathog 17(6): e

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: I. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

While an origin as a zoonotic spillover in the Huanan Seafood Market, Wuhan, sometime during early December has been proposed [1], this has been called into question [2-4].

This uncertainty arises due to both the presence of earlier potential COVID cases and the fact that most phylogenetic analyses put the most recent ancestor at between mid-November and early December [5]. Uncertainty around origination dates extends beyond the suggested zoonotic overspill in China to all countries where SARS-CoV-2 has spread.

For example, in France the first case of COVID was recorded as January 25; however, a recent retrospective review of medical records from patients in the intensive care unit (ICU) with both influenza-like illness (ILI) symptoms and pulmonary ground-glass opacity, admitted between December 2 and January 16 (14 patients of 58), identified one patient as having COVID who had presented to the emergency ward on December 27 [6].

Here we repurpose extinction models from conservation science to estimate the potential for earlier cases of COVID than have been reported in countries and territories. As such, we specifically date the origin of cases that resulted in the virus taking hold in each country.

Within the discipline of conservation science, a number of models have been developed to infer or date extinction events based on a series of sightings of a species. Interest lies in determining whether a species still persists, having not been sighted for a period of time.

If it is assumed the species is extinct, interest then lies in determining when extinction occurred. The application of these models, particularly the Optimal Linear Estimation (OLE) method developed by Roberts and Solow [8], has been proposed in a number of areas beyond extinction modelling to determine end points, including geological stratigraphy [9], archaeology [10], phenological studies [11], and phylogenetics [12]. Such knowledge is critical for our understanding of the spread of this disease.

Such case dates were removed from the dataset, while maintaining k at between 5 and 10 depending on the number of available case dates.

By removing the earliest case date from the record for Yemen, the number of case dates fell below five and therefore Yemen was not analysed. Using the OLE, origination dates were 4. Vertical dashed lines represent mean values. Map layers were created using the R package rworldmap, Version 1. Additionally notable are the estimated dates within Europe. However, it should be noted that Italy was one of the six countries with exceptionally early cases and therefore the result for Italy was affected by the removal of this early case i.

Nsoesie et al. The recent joint WHO-China study on the global origin of SARS-CoV-2 found that, based on a review of molecular evidence, most point estimates place the most recent ancestor at between mid-November and early December, with a range from late September to early December [5].

Our results support the existing evidence and suggest that the first case of COVID would have been sometime between early October and mid-November. Further, our results suggest the most likely timing of the first case to be November 17. This is only 1 day after a case identified in a traveller to Thailand from Wuhan on January 8 [15, 16].

Similarly, from an analysis of 40 composite influent wastewater samples from northern Italy, La Rosa et al. Further analysis of retrospective testing studies will help validate the application of OLE and associated methods. Using the method of Solow and Smith [ 14 ], we identified six countries with exceptionally early cases of COVID compared with the rest of the case time series for those countries. These may represent isolated cases, infections that did not contribute to the eventual spread of COVID through the country or territory.

However, currently only the results of retrospective testing have been published for Italy, as described above. Without such analyses it is not possible to determine whether our results have in fact identified early isolated cases or simply reflect poor surveillance and pre-symptomatic transmission.

In the same way that extinction events are rarely observed, so too are origination events such as those of COVID. Without rigorous tracing systems, the dates of the first cases have to be inferred. In the case of emerging infectious diseases, this is most frequently based on phylogenetic analysis. For this to be meaningful, it requires sufficient sampling and diversity. Here we applied a well-established extinction estimator i.

As the method can be effectively applied to very sparse datasets, with as few as 4-5 records [18, 19], it illustrates the potential to rapidly gain an understanding of the origination timings of novel zoonotic diseases when they are poorly known. Moreover, some of the approaches from this group of methods can be applied even to datasets with just two records [20] or even a single record [21].

Using methods borrowed from conservation science, we are able to estimate a range of likely dates for the zoonotic spillover of COVID into humans in China and the subsequent spread to countries around the world.

As the dataset of cases in China does not extend to the first verified cases, we used the dataset presented by Huang et al. From these datasets we created time series of new cases for each country. While the datasets present the number of cases per day, it is not possible to determine whether these cases are independent or related. We therefore used the case days rather than individual cases i. It is important to note that case days represent the time when cases were reported, and not the time of transmission.
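The reduction from daily counts to case days can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `case_days` and `earliest_k` are hypothetical, the counts are made-up values, and the 5 to 10 window mirrors the range of k described in the text.

```python
from datetime import date


def case_days(daily_new_cases):
    """Return the sorted dates on which at least one new case was reported.

    Each date appears once, however many cases were reported that day,
    because the counts cannot be assumed to be independent events.
    """
    return sorted(day for day, count in daily_new_cases.items() if count > 0)


def earliest_k(days, k_min=5, k_max=10):
    """Return up to k_max earliest case days, or None if fewer than k_min exist."""
    if len(days) < k_min:
        return None  # too few case days to analyse
    return days[:k_max]


# Hypothetical reported counts, for illustration only.
example_counts = {
    date(2020, 3, 1): 2,
    date(2020, 3, 2): 0,
    date(2020, 3, 3): 1,
    date(2020, 3, 4): 5,
    date(2020, 3, 6): 3,
    date(2020, 3, 7): 8,
}

example_days = case_days(example_counts)
example_window = earliest_k(example_days)
```

Days with zero reported cases drop out, so the resulting series records when reporting occurred rather than how many cases were reported.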

There are a number of exceptionally early cases in specific countries that may have arisen for a number of reasons e.

These exceptionally early cases propagate uncertainty in origination estimators and therefore we applied a method proposed by Solow and Smith [14] to identify such cases. In the context of COVID, this method asks the question: given an early case, what is the probability that it belongs to the main body of cases? This method has previously been used in conservation science to determine whether new sightings of the European polecat (Mustela putorius) in Scotland arose from the native population that was thought to be extirpated or arose from surreptitious reintroduction [20].

Here we use this method to identify cases of COVID that appear not to have taken hold within a country. The basic assumption of the Solow and Smith [14] method is that these represent the k largest values of a larger collection of values generated from a distribution from the Gumbel domain of attraction. Suppose that an earlier case of COVID is recorded at time y; interest centres on assessing the exceptionality of the earlier record. We applied this test using the first 5 to 10 (k) earliest case dates of COVID, depending on the length of the case record for each country.

OLE uses the time series of last known chronological occurrences of the studied phenomenon to estimate the time after the last known occurrence when the process that was generating them has stopped, and the phenomenon will consequently no longer be observable.

However, in our case we are interested in the timing of origination rather than extinction, so we apply it here with the reverse temporal direction [10]. The OLE method has proved to be robust in the inference of extinction under a variety of scenarios, reporting probabilities and trends [18, 24].

It is important to note that the underlying assumptions of the OLE are not specific to biological organisms and the species extinction process, and that the method does not contain any biologically specific parameters. OLE simply takes into account intervals between occurrences of a phenomenon and their distribution, irrespective of the type of phenomenon studied. This makes it readily applicable to diverse types of phenomena, as long as they are characterized by sporadic records made before the phenomenon or the process ceased [10].

It also does not require a complete record, but it accounts for records being generated based on some unknown probability. OLE has been shown to perform well under different rates and trends in sighting effort [18, 24], which in this case corresponds to reporting probability. Furthermore, OLE is a non-parametric method and it does not make any assumptions about the sighting rates or data distribution, making it more flexible compared to other methods [19, 25]. Finally, OLE is based on extreme value theory, which shows that the distribution of the maximum is well approximated by the generalised extreme value distribution, regardless of the actual distribution of records [19, 25, 26].

The shape parameter of the joint Weibull distribution of the k earliest case date times is also estimated. Having excluded exceptionally early cases using the method of Solow and Smith [14], as they likely represent cases where COVID has failed to take hold, we used the first 5 to 10 (k) earliest confirmed case dates for each country as suggested by Solow [19] and Rivadeneira et al. However, as there is no specific start date, because it varies depending on the arrival time of COVID in each country, the 10th case date is used as the end of the period.

The origination date was calculated using the R software package sExtinct [ 27 ].
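For readers without R, the point estimate behind an OLE-style origination date can be sketched in Python. This is a minimal sketch and not the sExtinct implementation: it assumes the commonly cited Roberts and Solow formulation (a shape estimate from the spacing of the k records, then weights obtained from a symmetric matrix of gamma-function terms), applies it to negated day numbers to reverse the temporal direction, and omits confidence intervals. The names `ole_origination` and `_solve` are hypothetical.

```python
import math


def _solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ole_origination(case_day_numbers):
    """Estimate an origination day from the k earliest distinct case days.

    case_day_numbers: day numbers (e.g. days since a reference date).
    Assumed formulation: weights a = L^-1 e / (e' L^-1 e), where
    L[i][j] = G(2v+i) G(v+j) / (G(v+i) G(j)) for j <= i (G = gamma function)
    and v is estimated from the spacing of the records.
    """
    days = sorted(set(case_day_numbers))
    k = len(days)
    if k < 3:
        raise ValueError("at least three distinct case days are needed")
    # Reverse time: the earliest case becomes the largest value, as in the
    # extinction setting where the most recent sighting comes first.
    s = [-d for d in days]
    span = s[0] - s[-1]
    v = sum(math.log(span / (s[0] - s[i])) for i in range(1, k - 1)) / (k - 1)

    def lam(i, j):
        hi, lo = max(i, j), min(i, j)
        return math.exp(math.lgamma(2 * v + hi) + math.lgamma(v + lo)
                        - math.lgamma(v + hi) - math.lgamma(lo))

    matrix = [[lam(i, j) for j in range(1, k + 1)] for i in range(1, k + 1)]
    w = _solve(matrix, [1.0] * k)
    total = sum(w)
    weights = [x / total for x in w]  # weights sum to one
    reversed_estimate = sum(a * t for a, t in zip(weights, s))
    return -reversed_estimate  # undo the time reversal
```

Because the weights sum to one and the shape estimate depends only on the gaps between case days, shifting every date by a constant shifts the estimate by the same constant, which makes a convenient sanity check before comparing the output against sExtinct.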