WorkflowV1.Rmd
The following vignette has been developed to outline the recording selection process. V1 is the first recording selection process if you want a faster way see V2.This workflow is specific to selecting acoustic recordings for the identification of landbirds in the Northwest Territories, Canada by the Canadian Wildlife Service. This can be modified for your own purposes.
Your working directory will be the location of all of the recordings you want to select from. It is important that recordings are organized in folders within this working directory. Each folder named after the location of the recordings it contains.
setwd("G:\\Deployed_ARU_Recordings")
The TrillR package uses SoX to deal with .wav file manipulation and generating spectrograms. To use many of the functions in this package your first need to direct the package to the location of the sox.exe. The latest release can be downloaded here https://sourceforge.net/projects/sox/files/sox/.
setsox.exe("C:/Users/deane/Desktop/sox-14.4.2/sox.exe")
Use the the get.wavs()
function to get all recordings and metadata. You can also get the file durations using getDuration=T
, however this can be time consuming. If you choose to get file durations then you can filter out recordings that are shorter than a required length (e.g. 180s or 3 min).
data <- get.wavs(start.date = "2017-06-01",end.date = "2017-06-30", getDuration=F, minduration = 180)
# Save file so you don't have to read files again. Especially if you got file durations.
#write.csv(data,"data.csv")
#head(data)
Location data is important if you want to calculate sun time such as sunset and sunrise for selecting recordings. The mergelocations()
function will merge this data for you and alert you of any problems.
locationdata <- read_excel("LV.xlsx")
data <- mergelocations(data,locationdata,locationname = "Location", Latitude = "Latitude", Longitude = "Longitude")
Sunrise and Sunset are both used to categorize recordings of interest for interpretation. In order to calculate these our data needs to have coordinates which we did above. For increased speed we will run this function in parallel.
data <- getSunCalcs(data, calc = c("sunrise","sunset"), doParallel=T)
Next is to put recordings into time categories for selection. The idea is that for each location there will be one randomly selected recording for each category. For our purposes lets create 8 time categories. We will divide the the recordings up into two equal halves of the month of June depending on the number of days of recordings for each location. The first half will be early (‘E’) and the second half will be late (L). Then for each early and late June recordings we will select four time periods: an hour before sunrise to an hour after sunrise, an hour after sunrise to three hours after sunrise, three hours after sunrise to five hours after sunrise, and an hour before sunset to an hour after sunset. This results in 8 categories to define:
To achive this we can generate columns for start and end dates/times that is precalculated.
data <- data %>% dplyr::group_by(location) %>% mutate(category=NA) %>%
mutate(start.date=min(JDay),end.date=ceiling(mean(c(max(JDay),min(JDay)))),start.time=as_hms(sunset-3600),end.time=as_hms(sunset+3600)) %>%
categorize("EN",start.date,end.date,start.time,end.time)%>%
mutate(start.date=ceiling(mean(c(min(JDay),max(JDay))))+1,end.date=max(JDay),start.time=as_hms(sunset-3600),end.time=as_hms(sunset+3600)) %>%
categorize("LN",start.date,end.date,start.time,end.time)%>%
mutate(start.date=min(JDay),end.date=ceiling(mean(c(max(JDay),min(JDay)))),start.time=as_hms(sunrise-3600),end.time=as_hms(sunrise+3600)) %>%
categorize("EE",start.date,end.date,start.time,end.time) %>%
mutate(start.date=ceiling(mean(c(min(JDay),max(JDay))))+1,end.date=max(JDay),start.time=as_hms(sunrise-3600),end.time=as_hms(sunrise+3600)) %>%
categorize("LE",start.date,end.date,start.time,end.time) %>%
mutate(start.date=min(JDay),end.date=ceiling(mean(c(max(JDay),min(JDay)))),start.time=as_hms(sunrise+3601),end.time=as_hms(sunrise+(3600*3))) %>%
categorize("EM",start.date,end.date,start.time,end.time) %>%
mutate(start.date=ceiling(mean(c(min(JDay),max(JDay))))+1,end.date=max(JDay),start.time=as_hms(sunrise+3601),end.time=as_hms(sunrise+(3600*3))) %>%
categorize("LM",start.date,end.date,start.time,end.time) %>%
mutate(start.date=min(JDay),end.date=ceiling(mean(c(max(JDay),min(JDay)))),start.time=as_hms(sunrise+(3600*3)+1),end.time=as_hms(sunrise+(3600*5))) %>%
categorize("EL",start.date,end.date,start.time,end.time) %>%
mutate(start.date=ceiling(mean(c(min(JDay),max(JDay))))+1,end.date=max(JDay),start.time=as_hms(sunrise+(3600*3)+1),end.time=as_hms(sunrise+(3600*5))) %>%
categorize("LL",start.date,end.date,start.time,end.time) %>% select(-start.date,-end.date,-start.time,-end.time)
#### Remove NAs for your final data to select from.
dataselection <- data[!is.na(data$category),]
Now that we have all available recordings categorized we can create spectrograms to assess weather suitability. This can be done multiple ways but it is recommended to do recording selection in rounds. We can make use of the slice_sample(n = 1)
to get a random selection of recordings to check weather suitability.
### Random sample of all eligible recordings (essentially all recordings but reorders them)
roundone <- dataselection %>% group_by(location,category) %>% slice_sample(n = 1)
sox.spectrograms(roundone, duration = list(start = 0, end = 180), doParallel = T) # 3 minute spectrogram
The sox.spectrograms()
function will output a recording selection file where you will you can choose a selection by changing the “Selected” Column to yes.
To do recording selection follow these steps:
Open your recordingselection.csv file and if needed sort by spectrogram file name
Find the first recording in the recordingselection.csv and find the corresponding spectrogram and assess it for weather suitablity. If the recording is good change “Selected” from “No” to “Yes”
Continue for all recordings and once complete read in your selection data:
roundoneresults <- read.csv("selectedrecordings.csv", stringsAsFactors = F)
finalselection <- selectiondata[selectiondata$Selected =="Yes",] ### Just get selected add to a final selection dataframe
dataselection <- remove.files(dataselection, roundoneresults)
dataselection <- remove.selection(dataselection, finalselection)
roundtwo <- dataselection %>% group_by(location,category) %>% slice_sample(n = 1)
sox.spectrograms(roundtwo, duration = list(start = 0, end = 180), doParallel = T) # 3 minute spectrogram