-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add capability to process DNADerivedData to occurrences
module
#98
Comments
DNADerivedData is a Darwin Core extension in the same way MeasurementOrFacts is, so I think they should be treated the same. I agree that occurrences.search is already too heavy. Let's come at the problem from the other direction and think about the ideal user interface without technical considerations. This is what felt intuitive to me: occurrences = pyobis.search.occurrences(
scientificname='Mola mola',
required_extensions=["MoF","DNA"]
) However, we have already decided to separate the query-building from the query-executing, and I think that was good choice. Maybe we could aim something like this instead: occurrences = pyobis.OccQuery(
scientificname='Mola mola',
required_extensions=["MoF","DNA"]
).execute() This is nice because it positions the code to potentially generalize to handle other DwC providers as well: occurrences = pyobis.OccQuery(
scientificname='Mola mola',
required_extensions=["MoF","DNA"]
).execute(
data_providers=["OBIS", "GBIF"]
) So, with this in mind, I am thinking we begin adding new features to the occurrences = pyobis.OccQuery(
required_extensions=["DNA"]
).search(
scientificname='Mola mola',
MeasurementOrFact=True
) Thoughts? |
I agree that the ways you listed are much intuitive and user-friendly than the current project. But I had two questions to ask:
|
I think the object-oriented response to this is that we have multiple subclasses of
Both have shared attributes and methods under the parent class
The object-oriented approach can be a bit pedantic, but it would work here. However, our focus should remain on keeping the intuitiveness of the user-interface first.. Is the object-oriented approach starting to over-complicate the UI? Maybe a good compromise is to use OO programming under the hood and provide convenience methods to wrap that complexity for casual users. By this I mean little functions like this example: # this function available as pyobis.occurrences.search()
define search(scientific_name):
return pyobis.OccSearchQuery(
scientificname=scientific_name
).execute() |
This is interesting, building on your suggestion I think we might need to refactor how we handle functions and classes in each module. I ran an experiment with Sample Usage# intuitive way of doing things
from pyobis import taxa
query1 = taxa.search(scientificname="mola mola")
query1.execute()
# set query1.data to the fetched data from source, before this we only built the query not the response
query1.data
# show the data
query1.api_url
# show the api url
query1.mapper_url
# show the mapper url
query1.to_pandas()
# convert the data to a pandas dataframe Under the hood (inside the module)from ..obisutils import build_api_url, handle_arrstr, obis_baseurl, obis_GET
import pandas as pd
def search(scientificname=None, **kwargs):
"""
Get taxon records.
:param scientificname: [String,Array] One or more scientific names from the
OBIS backbone. All included and synonym taxa are included in the search
:return: A TaxaResponse object
"""
scientificname = handle_arrstr(scientificname)
url = obis_baseurl + "taxon/" + scientificname
args = {"scientificname": scientificname}
# return a taxa response class
return TaxaResponse(url, args)
class TaxaResponse():
"""
Taxa Response Class
"""
def __init__(self, url, args):
# public members
self.data = None
self.api_url = build_api_url(url, args)
self.mapper_url = None
# private members
self.__args = args
self.__url = url
def execute(self, **kwargs):
out = obis_GET(
self.__url,
self.__args,
"application/json; charset=utf-8",
**kwargs
)
self.data = out
def to_pandas(self):
return pd.DataFrame(self.data["results"]) in the def build_api_url(url, args):
return (
url
+ "?"
+ urlencode({k: v for k, v in args.items() if v is not None})
) how does this sound? |
I think this looks good, but I have a tendency to overengineer things. @ocefpaf (and|or) @MathewBiddle : what do you think? The object-oriented programming approach is familiar to me, and wrapping things with convenience methods for casual users seems reasonable to me too. But maybe there is a more elegant approach? |
Looks good to me. The user interface is not too complex and the internals are easy to follow. +1 |
Overview
The
pyobis
package in its current state does not support fetchingDNADerivedData
using theoccurrences
module.I believe there can be two ways to implement it:
search
function likeMeasurementOrFacts
dna( self, **args, **kwargs )
and enable necessary processing and utility methods sincesearch
function is already heavy with so many features.The text was updated successfully, but these errors were encountered: