Wrapper function to get clean "analysis-ready" data

This function retrieves and cleans the experiment and survey data. It chains several helper functions that filter and format the data: filter_random_accuracy_ids(), filter_manually_identified_ids(), filter_suspicious_rt_ids(), factor_categories(), factor_groups(), factor_chr_vars(), factor_strategies(), and compute_nieq_scores(). It returns a list of two data frames: df_expe, the cleaned experiment data, and df_survey, the cleaned survey data.

Usage

get_clean_data(
  n_groups = 2,
  exclude_no_vviq = TRUE,
  exclude_no_osivq = TRUE,
  exclude_no_raven = TRUE,
  exclude_cheated = TRUE,
  exclude_distracted = TRUE,
  exclude_treatment = FALSE,
  exclude_adhd = FALSE,
  exclude_asd = FALSE,
  exclude_dyslexia = FALSE,
  exclude_other = FALSE,
  sd_mult = 2.25,
  verbose = FALSE
)

Arguments

n_groups: The number of groups to factor in the data. Must be 2, 3, or 4. With 2, the sample is divided into Aphants and Typical imagers using the VVIQ criterion of 32. With 3, it is divided into Aphants (VVIQ = 16), Hypophants (VVIQ < 32), and Typical imagers. With 4, Hyperphants (VVIQ > 75) are also isolated.
exclude_no_vviq: Logical, whether to exclude participants without VVIQ.
exclude_no_osivq: Logical, whether to exclude participants without OSIVQ.
exclude_no_raven: Logical, whether to exclude participants without Raven.
exclude_cheated: Logical, whether to exclude participants who have cheated (based on self-report).
exclude_distracted: Logical, whether to exclude participants who have been distracted (based on self-report).
exclude_treatment: Logical, whether to exclude participants who are receiving treatment for a neurological or psychiatric disorder.
exclude_adhd: Logical, whether to exclude participants who have ADHD.
exclude_asd: Logical, whether to exclude participants who have ASD.
exclude_dyslexia: Logical, whether to exclude participants who have dyslexia.
exclude_other: Logical, whether to exclude participants who have other neurological disorders.
sd_mult: Numeric, the number of standard deviations used to identify suspicious median RTs. The default is 2.25: median RTs more than 2.25 standard deviations below the mean are considered suspiciously fast and flagged as potential "spamming".
verbose: A logical value indicating whether to print verbose messages about the filtering process. Default is FALSE.
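
The grouping rule described for n_groups can be illustrated with a minimal sketch. The helper classify_vviq() below is hypothetical and is not the package's factor_groups(). It also assumes that the cutoff at 32 is exclusive in the two-group case, which the description above does not state.

```r
# Hypothetical helper, not part of the package: assigns imagery groups from
# VVIQ total scores using the cutoffs described for n_groups.
classify_vviq <- function(vviq, n_groups = 2) {
  stopifnot(n_groups %in% c(2, 3, 4))
  if (n_groups == 2) {
    # Assumption: a score of exactly 32 counts as Typical.
    labels <- ifelse(vviq < 32, "Aphant", "Typical")
  } else {
    labels <- ifelse(vviq == 16, "Aphant",
                     ifelse(vviq < 32, "Hypophant", "Typical"))
    # The fourth group isolates the highest scorers.
    if (n_groups == 4) labels[which(vviq > 75)] <- "Hyperphant"
  }
  labels
}

classify_vviq(c(16, 25, 60, 78), n_groups = 4)
#> [1] "Aphant"     "Hypophant"  "Typical"    "Hyperphant"
```

With n_groups = 2, the same scores collapse to "Aphant", "Aphant", "Typical", "Typical".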

Value

A list containing two data frames:

df_expe: The cleaned experiment data.
df_survey: The cleaned survey data.
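
Both data frames carry an id column, so survey variables can be attached to the trial-level data. The sketch below uses a made-up stand-in for the returned list rather than real data, and base R merge() is only one possible way to join the two.

```r
# Toy stand-in for the returned list; the real data frames have many more
# columns and rows. All values here are invented for illustration.
clean_data <- list(
  df_expe   = data.frame(id = c("a", "a", "b"), accuracy = c(1L, 0L, 1L)),
  df_survey = data.frame(id = c("a", "b"), age = c(24L, 26L))
)

df_expe   <- clean_data$df_expe
df_survey <- clean_data$df_survey

# Keep only the survey columns of interest before joining, because the two
# data frames share other columns (e.g. language, group) besides id.
df_joined <- merge(df_expe, df_survey[, c("id", "age")], by = "id")
nrow(df_joined)
#> [1] 3
```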

Examples

clean_data <- get_clean_data(verbose = TRUE)
#> 
#> Sample size before accuracy analysis: 137
#> Participants below random accuracy (<= 50%): 8 (5.84%)
#> 
#> Sample size before manual examination: 137
#> Manually identified participants:
#> - N without VVIQ: 3 -> Excluded
#> - N without OSIVQ: 6 -> Excluded
#> - N without Raven: 2 -> Excluded
#> - N who cheated: 3 -> Excluded
#> - N who were distracted: 12 -> Excluded
#> - N who had treatment: 4 -> Included
#> - N with ADHD: 7 -> Included
#> - N with ASD: 5 -> Included
#> - N with dyslexia: 2 -> Included
#> - N with other neuro troubles: 2 -> Included
#> Participants to exclude: 24 (17.52%)
#> 
#> Sample size before median RTs analysis: 106
#> Participants with median RTs outside 2.25 SDs: 2 (1.89%)
head(clean_data$df_expe)
#> # A tibble: 6 × 19
#>   id     language group group_2 group_3 expe_phase trial_number problem category
#>   <fct>  <fct>    <fct> <fct>   <fct>   <fct>             <int>   <int> <fct>   
#> 1 acdn2… fr       Typi… Typical Typical expe_bloc…            1      18 Spatial 
#> 2 acdn2… fr       Typi… Typical Typical expe_bloc…            2      25 Control 
#> 3 acdn2… fr       Typi… Typical Typical expe_bloc…            3       2 Visual  
#> 4 acdn2… fr       Typi… Typical Typical expe_bloc…            4      19 Control 
#> 5 acdn2… fr       Typi… Typical Typical expe_bloc…            5       1 Visual  
#> 6 acdn2… fr       Typi… Typical Typical expe_bloc…            6      10 Spatial 
#> # ℹ 10 more variables: premise_1_rt <dbl>, premise_2_rt <dbl>,
#> #   premise_3_rt <dbl>, conclusion_rt <dbl>, rt_total <dbl>, response <fct>,
#> #   correct_response <fct>, accuracy <int>, acc_perc <dbl>, median_rt <dbl>
head(clean_data$df_survey)
#> # A tibble: 6 × 112
#>   id         language   age gender group group_2 group_3 country language_native
#>   <fct>      <fct>    <int> <fct>  <fct> <fct>   <fct>   <fct>   <fct>          
#> 1 acdn24772… fr          24 f      Typi… Typical Typical fra     fr             
#> 2 ahos20623… fr          26 f      Apha… Aphant… Aphant… fra     fr             
#> 3 anoo20152… fr          23 m      Typi… Typical Typical fra     fr             
#> 4 arje91119… fr          26 f      Typi… Typical Typical fra     fr             
#> 5 auzb74885… fr          25 f      Typi… Typical Typical fra     fr             
#> 6 azcj31777… fr          28 m      Hypo… Aphant… Hypoph… fra     fr             
#> # ℹ 103 more variables: language_usual <fct>, job <fct>, education <fct>,
#> #   field <fct>, vviq_is_complete <lgl>, vviq_total_score <int>,
#> #   vviq_q01 <int>, vviq_q02 <int>, vviq_q03 <int>, vviq_q04 <int>,
#> #   vviq_q05 <int>, vviq_q06 <int>, vviq_q07 <int>, vviq_q08 <int>,
#> #   vviq_q09 <int>, vviq_q10 <int>, vviq_q11 <int>, vviq_q12 <int>,
#> #   vviq_q13 <int>, vviq_q14 <int>, vviq_q15 <int>, vviq_q16 <int>,
#> #   osivq_is_complete <lgl>, osivq_object <dbl>, osivq_spatial <dbl>, …
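
The median-RT step reported at the end of the verbose output depends on sd_mult. The sketch below shows the rule as described in the Arguments section. The helper flag_fast_ids() is hypothetical and is not the package's filter_suspicious_rt_ids(). It assumes the mean and standard deviation are computed across all participants' median RTs, and the data are made up.

```r
# Hypothetical helper: returns the ids whose median RT lies more than
# sd_mult standard deviations below the mean of all median RTs.
flag_fast_ids <- function(ids, median_rt, sd_mult = 2.25) {
  # Assumption: mean and SD are taken over all participants' median RTs.
  cutoff <- mean(median_rt) - sd_mult * sd(median_rt)
  ids[median_rt < cutoff]
}

# Made-up median RTs in seconds; "p10" answers implausibly fast.
ids <- paste0("p", 1:10)
median_rt <- c(10, 10, 10, 10, 10, 10, 10, 10, 10, 1)

flag_fast_ids(ids, median_rt)
#> [1] "p10"
```

A larger sd_mult makes the filter more lenient: with sd_mult = 3, no participant in this toy sample is flagged.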
