Identifying the right cohort from DEPOT

Before using this functionality, you must be registered to use the DEPOT tool and must have created and saved a cohort with it. You can find your saved cohorts here.

On the page listing your saved cohorts, you will see a table with two columns: “NAME”, the name of the cohort you created using DEPOT, and “ID”, the unique cohort ID you will need in order to pull the data for just these cases using the tbportals.depot.api package.

Pulling data from an endpoint for just the cases in the DEPOT cohort of interest

# Example request for the data contained in the Biochemistry endpoint
REQUEST <- tidy_depot_api(path = "Biochemistry", token = TOKEN, cohortId = "PASTE cohort ID number Here")

# The JSON data from the API is returned in the content section as a data.frame
REQUEST$content

# The endpoint can be found in the path section
REQUEST$path

# Specific information about the httr request can be found in the response section
REQUEST$response
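
Before working with the result, it can be worth checking that the returned object has the shape described above. The sketch below uses a hand-built list as a stand-in for REQUEST and assumes only that content is a data.frame and path is a character string; check_request() and mock_request are hypothetical names for illustration, not part of tbportals.depot.api.

```r
# Hypothetical helper (not part of tbportals.depot.api): basic sanity checks
# on an object shaped like the list returned by the request above
check_request <- function(request) {
  stopifnot(
    is.data.frame(request$content),  # content is expected to be a data.frame
    is.character(request$path)       # path is assumed to be a character string
  )
  # Return the number of records so an empty result is easy to spot
  nrow(request$content)
}

# Stand-in for REQUEST with made-up values; a real object would come from
# tidy_depot_api()
mock_request <- list(
  content = data.frame(patient_id = c("a", "b"), test_date = c(10, 20)),
  path = "Biochemistry"
)

n_records <- check_request(mock_request)
n_records
```

With a real request, the same helper could be called as check_request(REQUEST); a result of 0 would suggest that the endpoint returned no records for the supplied cohort ID.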

Filtering on the records relating to the cohort ID of interest

Given the requirements of TB Portals, the example below uses purely hypothetical data with a structure similar to what the API call returns. Only the first few columns are shown (ids, relative dates, and specimen information), without the corresponding lab test columns. The final column, which is used to filter on the cohort records, is also shown.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(uuid)

# Structure of a hypothetical data.frame from REQUEST$content
df <- data.frame("patient_id" = UUIDgenerate(n = 5),
                 "condition_id" = UUIDgenerate(n = 5),
                 "specimen_id" = UUIDgenerate(n = 5),
                 "observationfhir_id" = UUIDgenerate(n = 5),
                 "test_date" = sample(0:100, size = 5),
                 "specimen_collection_site" = rep("blood", 5)) %>%
  mutate(
    specimen_collection_date = test_date,
    in_requested_cohort = c("No", "Yes", "Yes", "No", "No"))


df
#>                             patient_id                         condition_id
#> 1 667305cc-295c-4915-950e-ac9cf36766a3 ebd812de-4d16-42be-bfca-b6eaebb2a314
#> 2 e9282b0a-f34f-457a-91f2-329b166f8fe2 de402457-b791-4d83-a726-a0d14a511f15
#> 3 95e9cc2a-8091-494c-8957-8bac0351f480 fce6c544-2e24-4818-abba-6057b1ead28b
#> 4 f84bfccb-82ee-4046-836e-47b9858cd91f 0ce80626-bb53-4f6d-b083-1562978d46d1
#> 5 494b671a-746a-44cd-964d-61f005e3e9c7 a3a306d7-f455-4222-9180-6e817181782f
#>                            specimen_id                   observationfhir_id
#> 1 a89df642-5f3c-4422-9a72-0a5a6da78f62 e21550e6-c59c-4252-9299-ac0c0e345719
#> 2 9b3219a0-42c3-4e80-a417-3140eb28dc97 9e4b37c6-9369-42e2-8e28-dc3542db9432
#> 3 efb73b65-0897-4e8d-bc66-88c8cfcf0783 fd18ddf4-6b83-432b-aee9-d7951314d5da
#> 4 31e7b5f1-d869-4989-8aa3-8adc69b26738 7a36cf1e-fb57-4b99-9222-a71bc15f4420
#> 5 7c9f1be3-1ca6-43f1-bf0c-d04f2d8f06d2 59e4cbb5-1ddd-40e5-8a0b-b1f6be551eb3
#>   test_date specimen_collection_site specimen_collection_date
#> 1        94                    blood                       94
#> 2        91                    blood                       91
#> 3        29                    blood                       29
#> 4        11                    blood                       11
#> 5        90                    blood                       90
#>   in_requested_cohort
#> 1                  No
#> 2                 Yes
#> 3                 Yes
#> 4                  No
#> 5                  No

To keep only the records within the DEPOT cohort of interest, filter on the in_requested_cohort column and keep the rows whose value is “Yes”.

# Filter on a hypothetical data.frame using only the records from a cohort ID of interest from the API call
df_cohort <- df %>%
  filter(in_requested_cohort == "Yes")

df_cohort
#>                             patient_id                         condition_id
#> 1 e9282b0a-f34f-457a-91f2-329b166f8fe2 de402457-b791-4d83-a726-a0d14a511f15
#> 2 95e9cc2a-8091-494c-8957-8bac0351f480 fce6c544-2e24-4818-abba-6057b1ead28b
#>                            specimen_id                   observationfhir_id
#> 1 9b3219a0-42c3-4e80-a417-3140eb28dc97 9e4b37c6-9369-42e2-8e28-dc3542db9432
#> 2 efb73b65-0897-4e8d-bc66-88c8cfcf0783 fd18ddf4-6b83-432b-aee9-d7951314d5da
#>   test_date specimen_collection_site specimen_collection_date
#> 1        91                    blood                       91
#> 2        29                    blood                       29
#>   in_requested_cohort
#> 1                 Yes
#> 2                 Yes
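
The filter step above can also be written without dplyr and followed by a quick consistency check. This is a minimal sketch on a small made-up data.frame that keeps only a few of the columns shown above; the ids and values are invented, and the check itself is a suggestion rather than something the package requires.

```r
# Hypothetical stand-in for REQUEST$content; all values are made up
df <- data.frame(
  patient_id = c("p1", "p2", "p3", "p4", "p5"),
  test_date = c(94, 91, 29, 11, 90),
  in_requested_cohort = c("No", "Yes", "Yes", "No", "No"),
  stringsAsFactors = FALSE
)

# Base R equivalent of filter(in_requested_cohort == "Yes");
# which() drops rows where the flag is NA, as dplyr::filter() does
df_cohort <- df[which(df$in_requested_cohort == "Yes"), , drop = FALSE]

# Number of records flagged as belonging to the cohort
n_yes <- sum(df$in_requested_cohort == "Yes", na.rm = TRUE)

# The filtered data.frame should contain exactly the flagged records
stopifnot(nrow(df_cohort) == n_yes)

# Distinct patients represented in the cohort records
cohort_patients <- unique(df_cohort$patient_id)
```

Comparing the filtered row count against the number of “Yes” flags is a cheap way to catch unexpected values in the column, such as missing flags or differently cased strings.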