<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.3.3">Jekyll</generator><link href="https://nimrobotics.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://nimrobotics.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-04-07T20:40:47+00:00</updated><id>https://nimrobotics.github.io/feed.xml</id><title type="html">nimRobotics</title><subtitle>Exploring robotics research! </subtitle><entry><title type="html">Beginner’s guide to models in R</title><link href="https://nimrobotics.github.io/blog/2025/fnirsl-lsl/" rel="alternate" type="text/html" title="Beginner’s guide to models in R"/><published>2025-11-02T05:53:22+00:00</published><updated>2025-11-02T05:53:22+00:00</updated><id>https://nimrobotics.github.io/blog/2025/fnirsl-lsl</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2025/fnirsl-lsl/"><![CDATA[<p>In this exercise you will use a large dataset that describes houses sold in Ames, Iowa. You will use the Ames housing data from the “AmesHousing” package and parallel coordinate plots to understand how houses vary by neighborhood. Then you will fit a model for each neighborhood based on the size of the house to assess whether a linear model fits all neighborhoods in a similar manner. A scatter plot of the model parameters shows whether this is the case.</p> <p>Objectives:</p> <ul> <li>Create plots for exploratory data analysis of high-dimensional data</li> <li>Use abstract, aggregated model-based variables (i.e., overall fit, intercept, and slope of a linear model) to compare groups of data</li> <li>Appreciate how visualizations might help you identify unfair machine learning models</li> </ul> <p>Submit: Complete each chunk of code below to process the data and create the graphs. I have given you some bits of code to do some transformations that we have not discussed in class. 
Briefly answer the questions posed, describing what each chunk does and what the graphs mean.</p> <h2 id="load-packages">Load packages</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">AmesHousing</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">janitor</span><span class="p">)</span><span class="w"> </span><span class="c1"># Useful package for converting variable names, such as "Lot shape" to "lot_shape"</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## 
## Attaching package: 'janitor'
</code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">scales</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">broom</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2
</code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ readr::col_factor() masks scales::col_factor()
## ✖ purrr::discard()    masks scales::discard()
## ✖ dplyr::filter()     masks stats::filter()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (&lt;http://conflicted.r-lib.org/&gt;) to force all conflicts to become errors
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">ggrepel</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <h1 id="load-and-clean-the-data">Load and clean the data</h1> <p>What does the clean_names function do to the names?</p> <p>It converts the variable names to snake_case: it removes spaces, converts letters to lower case, and separates words with underscores. For example, “MS SubClass” becomes “ms_sub_class”.</p> <p>Why is “where” useful?</p> <p>It selects variables based on a property such as their type. For example, where(is.numeric) selects all numeric variables.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># names(ames_raw)</span><span class="w">
</span><span class="n">house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ames_raw</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">janitor</span><span class="o">::</span><span class="n">clean_names</span><span class="p">()</span><span class="w">
</span><span class="c1"># names(house.df)</span><span class="w">

</span><span class="n">house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">select</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span><span class="w"> </span><span class="n">sale_price</span><span class="p">,</span><span class="w"> </span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="c1"># Selects specific variables</span><span class="w">
         </span><span class="n">where</span><span class="p">(</span><span class="n">is.numeric</span><span class="p">),</span><span class="w"> </span><span class="c1"># Selects numeric variables</span><span class="w">
         </span><span class="o">-</span><span class="n">order</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="p">(</span><span class="n">misc_val</span><span class="o">:</span><span class="n">sale_condition</span><span class="p">),</span><span class="w"> </span><span class="o">-</span><span class="n">lot_area</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Removes specific variables</span><span class="w">
  </span><span class="n">select</span><span class="p">(</span><span class="n">which</span><span class="p">(</span><span class="n">colMeans</span><span class="p">(</span><span class="nf">is.na</span><span class="p">(</span><span class="n">.</span><span class="p">))</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="m">0.05</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Removes columns with more than 5% missing values</span><span class="w">
  </span><span class="n">filter</span><span class="p">(</span><span class="n">sale_price</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">15000</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Keeps houses that sold for more than $15,000</span><span class="w">
  </span><span class="n">group_by</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Groups by neighborhood</span><span class="w">
  </span><span class="n">filter</span><span class="p">(</span><span class="n">n</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">50</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Removes neighborhoods that have had fewer than 50 home sales</span><span class="w">
  </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="c1"># Ungroups the data</span><span class="w">
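
# A minimal sketch, not part of the original exercise, of what clean_names() and
# where() do on their own. The data frame toy is hypothetical; only its two
# column names are taken from the Ames data.
toy = data.frame("MS SubClass" = c(20, 60), "Lot Shape" = c("Reg", "IR1"), check.names = FALSE)
names(janitor::clean_names(toy)) # "ms_sub_class" "lot_shape"
names(dplyr::select(toy, where(is.numeric))) # "MS SubClass"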
</span></code></pre></div></div> <h1 id="create-a-parallel-coordinate-chart">Create a parallel coordinate chart</h1> <ul> <li>Try mapping color to neighborhood and faceting by neighborhood. Which is most effective and why?</li> </ul> <p>Both plots, one with color mapping and one with faceting, are shown below. The faceted plot is easier to interpret because it avoids the crowding caused by plotting many neighborhoods at once.</p> <ul> <li>What does the parallel coordinate plot reveal about the neighborhoods?</li> </ul> <p>The plot reveals which variables vary the most across the neighborhoods. It highlights the variables that account for most of the variance and that could possibly be used as predictors.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Convert to long format with all the numeric variables in one column (i.e., pivot longer)</span><span class="w">
</span><span class="c1"># Hint: specify the "cols" using the selection function from above: where(is.numeric)</span><span class="w">
</span><span class="n">long.house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">pivot_longer</span><span class="p">(</span><span class="n">cols</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">where</span><span class="p">(</span><span class="n">is.numeric</span><span class="p">),</span><span class="w"> </span><span class="n">names_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"variable"</span><span class="p">,</span><span class="w"> </span><span class="n">values_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"value"</span><span class="p">)</span><span class="w">

</span><span class="c1">## Scale values</span><span class="w">
</span><span class="c1"># Hint: Be sure to group by variable before scaling and to ungroup after</span><span class="w">
</span><span class="n">long.house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">long.house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">group_by</span><span class="p">(</span><span class="n">variable</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">mutate</span><span class="p">(</span><span class="n">value_sc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">scale</span><span class="p">(</span><span class="n">value</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">ungroup</span><span class="p">()</span><span class="w">
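
# A minimal sketch, not part of the original exercise: scale() returns a
# one-column matrix, so value_sc above is stored as a matrix column. Wrapping
# the call in as.numeric() would keep the same z-scores as a plain numeric
# vector. The vector x is hypothetical.
x = c(10, 20, 30)
is.matrix(scale(x)) # TRUE
as.numeric(scale(x)) # -1 0 1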

</span><span class="c1">## Create a variable to define what neighborhoods to plot</span><span class="w">
</span><span class="n">neighborhood_to_plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"StoneBr"</span><span class="p">,</span><span class="w"> </span><span class="s2">"NridgHt"</span><span class="p">,</span><span class="w"> </span><span class="s2">"NoRidge"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Somerst"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Timber"</span><span class="p">)</span><span class="w">

</span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">long.house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">mutate</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ifelse</span><span class="p">(</span><span class="n">neighborhood</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">neighborhood_to_plot</span><span class="p">,</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">,</span><span class="w"> </span><span class="s2">"no"</span><span class="p">))</span><span class="w">

</span><span class="c1">## Create a parallel coordinate plot of the scaled values for selected neighborhoods </span><span class="w">
</span><span class="c1"># Hint: Use the following to specify the subset of data to plot</span><span class="w">
</span><span class="c1">#  ggplot(data = long.house.df %&gt;%  filter(highlight == "yes"),</span><span class="w">
</span><span class="c1"># Hint: Consider the following to highlight the zero crossing and sales price</span><span class="w">
</span><span class="c1">#  geom_vline(xintercept = "sale_price", colour = "grey98", size = 3) +</span><span class="w">
</span><span class="c1">#  geom_hline(yintercept = 0, colour = "grey99", size = 4) +</span><span class="w">
</span></code></pre></div></div> <h2 id="mapping-color-to-neighborhood">mapping color to neighborhood</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">  </span><span class="n">filter</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">),</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">variable</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value_sc</span><span class="p">,</span><span class="w"> </span><span class="n">group</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pid</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">neighborhood</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_vline</span><span class="p">(</span><span class="n">xintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sale_price"</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_hline</span><span class="p">(</span><span class="n">yintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Parallel coordinate plot of house features by neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">coord_flip</span><span class="p">()</span><span class="w"> 
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
</code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-4-1.png" alt=""/></p> <h2 id="faceting-by-neighborhood">faceting by neighborhood</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">  </span><span class="n">filter</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">),</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">variable</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value_sc</span><span class="p">,</span><span class="w"> </span><span class="n">group</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pid</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_vline</span><span class="p">(</span><span class="n">xintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sale_price"</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_hline</span><span class="p">(</span><span class="n">yintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Parallel coordinate plot of house features by neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">facet_wrap</span><span class="p">(</span><span class="o">~</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">scales</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"free_y"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">coord_flip</span><span class="p">()</span><span class="w">
</span></code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-5-1.png" alt=""/></p> <h1 id="fit-models-to-all-the-neighborhoods-and-plot-the-estimated-intercept-and-slope">Fit models to all the neighborhoods and plot the estimated intercept and slope</h1> <ul> <li>Fit models to predict sale_price as a function of size: sale_price~x1st_flr_sf</li> <li>Use glance and tidy to extract both overall model fit data and model parameters</li> <li>Hint: After using nest-map-unnest use pivot wider to move the intercept and slope into separate columns</li> <li>Hint: After unnesting, use clean names to turn “(Intercept)” into an acceptable R name</li> <li>Hint: Use the scale to show axis labels as dollars</li> <li> <p>Map the size of the point to the r.squared value. The R-squared value indicates how well the model can predict the data</p> </li> <li>This plot uses a linear model to abstract and aggregate data for each neighborhood. Why might this be useful and why might it be worse than useless?</li> </ul> <p>A linear model is useful for understanding the relationship between the dependent variable (DV) and the independent variable (IV), and it can help predict the DV. Each neighborhood has different characteristics, so an individual model for each neighborhood may fit better. However, the model can be worse than useless if the underlying relationship in the data is not linear.</p> <ul> <li>If this regression model were guiding admission decisions for university students based on GPA rather than predicting house prices based on their size, how might the R-squared value indicate potential unfairness if the points represent different socio-economic groups rather than neighborhoods?</li> </ul> <p>Using GPA in a regression model to guide admission decisions can be very unfair, as the model captures only the linear relationship, and any student who deviates from the linear fit would be at a disadvantage. A lower R-squared value for one group would mean the model predicts that group less well, so decisions based on it would be less reliable for that group. 
Further, using just one variable (GPA) can be noisy.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Fit models for each neighborhood</span><span class="w">
</span><span class="n">models</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">group_by</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">nest</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">mutate</span><span class="p">(</span><span class="w">
          </span><span class="n">fit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="o">~</span><span class="n">lm</span><span class="p">(</span><span class="n">sale_price</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">x1st_flr_sf</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">.x</span><span class="p">)),</span><span class="w">
          </span><span class="n">glanced</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">glance</span><span class="p">),</span><span class="w">
          </span><span class="n">tidied</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">tidy</span><span class="p">),</span><span class="w">
          </span><span class="n">augmented</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">augment</span><span class="p">)</span><span class="w">
        </span><span class="p">)</span><span class="w"> 
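
# A minimal sketch, not part of the original exercise, of what tidy() and
# glance() return for a single model. The built-in mtcars data and the name
# fit_one are stand-ins chosen for illustration.
fit_one = lm(mpg ~ wt, data = mtcars)
broom::tidy(fit_one)$term # "(Intercept)" "wt": one row per model parameter
names(broom::glance(fit_one))[1] # "r.squared": one row of overall fit statistics
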
</span><span class="n">models</span><span class="w">
</span></code></pre></div></div> <div data-pagedtable="false"> <script data-pagedtable-source="" type="application/json">
{"columns":[{"label":["neighborhood"],"name":[1],"type":["chr"],"align":["left"]},{"label":["data"],"name":[2],"type":["list"],"align":["right"]},{"label":["fit"],"name":[3],"type":["list"],"align":["right"]},{"label":["glanced"],"name":[4],"type":["list"],"align":["right"]},{"label":["tidied"],"name":[5],"type":["list"],"align":["right"]},{"label":["augmented"],"name":[6],"type":["list"],"align":["right"]}],"data":[{"1":"NAmes","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Gilbert","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"StoneBr","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NWAmes","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Somerst","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NridgHt","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NoRidge","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"SawyerW","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Sawyer","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"BrkSide","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"OldTown","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"IDOTRR","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Edwards","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"CollgCr","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Crawfor","2":"<tibble[,31]>","3":"<S3: 
lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Mitchel","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Timber","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}}
  </script> </div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Unnest the model parameters </span><span class="w">
</span><span class="n">tidy_model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">models</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">unnest</span><span class="p">(</span><span class="n">tidied</span><span class="p">)</span><span class="w">


</span><span class="c1">## Unnest the model fit</span><span class="w">
</span><span class="n">glance_model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">models</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">unnest</span><span class="p">(</span><span class="n">glanced</span><span class="p">)</span><span class="w">


</span><span class="c1">## Use left_join to combine the parameter and fit dataframes</span><span class="w">
</span><span class="n">model_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">left_join</span><span class="p">(</span><span class="n">tidy_model</span><span class="p">,</span><span class="w"> </span><span class="n">glance_model</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">select</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">r.squared</span><span class="p">,</span><span class="w"> </span><span class="n">estimate</span><span class="p">,</span><span class="w"> </span><span class="n">std.error</span><span class="p">,</span><span class="w"> </span><span class="n">p.value.x</span><span class="p">)</span><span class="w">


</span><span class="c1">## Define highlighted neighborhoods </span><span class="w">
</span><span class="c1"># Hint: Adapt the code from the previous section</span><span class="w">
</span><span class="n">selected_model_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">model_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">filter</span><span class="p">(</span><span class="n">neighborhood</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">neighborhood_to_plot</span><span class="p">)</span><span class="w">


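</span><span class="c1">## Plot the slope and intercept in a scatter plot with the size of the point mapped to the r-square</span><span class="w">
</span><span class="c1"># Move the intercept and slope estimates into separate columns, then add r.squared</span><span class="w">
</span><span class="n">wide_model_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">tidy_model</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">select</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">term</span><span class="p">,</span><span class="w"> </span><span class="n">estimate</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">pivot_wider</span><span class="p">(</span><span class="n">names_from</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">term</span><span class="p">,</span><span class="w"> </span><span class="n">values_from</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">estimate</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">janitor</span><span class="o">::</span><span class="n">clean_names</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># "(Intercept)" becomes intercept</span><span class="w">
  </span><span class="n">left_join</span><span class="p">(</span><span class="n">glance_model</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">select</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">r.squared</span><span class="p">),</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">filter</span><span class="p">(</span><span class="n">neighborhood</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">neighborhood_to_plot</span><span class="p">)</span><span class="w">

</span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">wide_model_df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">intercept</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x1st_flr_sf</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">r.squared</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_point</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.5</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w">  </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Intercept"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Slope"</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Slope and intercept of linear model for each neighborhood"</span><span class="p">)</span><span class="w"> 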
</span></code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-6-1.png" alt=""/></p>]]></content><author><name></name></author><summary type="html"><![CDATA[In this exercise you will use a large dataset that describes houses that were...]]></summary></entry><entry><title type="html">Beginner’s guide to models in R</title><link href="https://nimrobotics.github.io/blog/2025/modelling-in-r/" rel="alternate" type="text/html" title="Beginner’s guide to models in R"/><published>2025-11-02T05:53:22+00:00</published><updated>2025-11-02T05:53:22+00:00</updated><id>https://nimrobotics.github.io/blog/2025/modelling-in-r</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2025/modelling-in-r/"><![CDATA[<p>In this exercise you will use a large dataset that describes houses that were sold in Ames, Iowa. You will use the Ames housing data from the package “AmesHousing” and parallel coordinate plots to understand how houses vary by neighborhood. Then you will fit models for each neighborhood based on the size of the house to assess whether a linear model fits all neighborhoods in a similar manner. A scatter plot shows whether this is the case.</p> <p>Objectives:</p> <ul> <li>Create plots for exploratory data analysis of high-dimensional data</li> <li>Use abstract, aggregated model-based variables (i.e., overall fit, intercept and slope of a linear model) to compare groups of data</li> <li>Appreciate how visualizations might help you identify unfair machine learning models</li> </ul> <p>Submit: Complete each chunk of code below to process the data and create the graphs. I have given you some bits of code to do some transformations that we have not discussed in class. 
Briefly answer the questions posed in describing what each chunk does and the meaning of the graphs.</p> <h2 id="load-packages">Load packages</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">AmesHousing</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">janitor</span><span class="p">)</span><span class="w"> </span><span class="c1"># Useful package for converting variable names, such as "Lot shape" to "lot_shape"</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## 
## Attaching package: 'janitor'
</code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">scales</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">broom</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">tidyverse</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2
</code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ readr::col_factor() masks scales::col_factor()
## ✖ purrr::discard()    masks scales::discard()
## ✖ dplyr::filter()     masks stats::filter()
## ✖ dplyr::lag()        masks stats::lag()
## ℹ Use the conflicted package (&lt;http://conflicted.r-lib.org/&gt;) to force all conflicts to become errors
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">ggrepel</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <h1 id="load-and-clean-the-data">Load and clean the data</h1> <p>What does the clean_names function do to the names?</p> <p>It converts the variable names to snake_case: it removes all spaces, converts to lower case, and uses underscores to separate words. For example, “MS SubClass” becomes “ms_sub_class”.</p> <p>Why is “where” useful?</p> <p>It is useful for selecting variables based on their type. For example, where(is.numeric) will select all numeric variables.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># names(ames_raw)</span><span class="w">
</span><span class="n">house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ames_raw</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">janitor</span><span class="o">::</span><span class="n">clean_names</span><span class="p">()</span><span class="w">
</span><span class="c1"># names(house.df)</span><span class="w">
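</span><span class="c1"># Sketch (not part of the original exercise): make_clean_names() shows the renaming on a single name,</span><span class="w">
</span><span class="c1"># and where() takes a predicate function, so the second line lists only the numeric columns of ames_raw</span><span class="w">
</span><span class="n">janitor</span><span class="o">::</span><span class="n">make_clean_names</span><span class="p">(</span><span class="s2">"MS SubClass"</span><span class="p">)</span><span class="w">
</span><span class="n">ames_raw</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">select</span><span class="p">(</span><span class="n">where</span><span class="p">(</span><span class="n">is.numeric</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="nf">names</span><span class="p">()</span><span class="w">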

</span><span class="n">house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">select</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span><span class="w"> </span><span class="n">sale_price</span><span class="p">,</span><span class="w"> </span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="c1"># Selects specific variables</span><span class="w">
         </span><span class="n">where</span><span class="p">(</span><span class="n">is.numeric</span><span class="p">),</span><span class="w"> </span><span class="c1"># Selects numeric variables</span><span class="w">
         </span><span class="o">-</span><span class="n">order</span><span class="p">,</span><span class="w"> </span><span class="o">-</span><span class="p">(</span><span class="n">misc_val</span><span class="o">:</span><span class="n">sale_condition</span><span class="p">),</span><span class="w"> </span><span class="o">-</span><span class="n">lot_area</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Removes specific variables</span><span class="w">
  </span><span class="n">select</span><span class="p">(</span><span class="n">which</span><span class="p">(</span><span class="n">colMeans</span><span class="p">(</span><span class="nf">is.na</span><span class="p">(</span><span class="n">.</span><span class="p">))</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="m">0.05</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Keeps columns with less than 5% missing values</span><span class="w">
  </span><span class="n">filter</span><span class="p">(</span><span class="n">sale_price</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">15000</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Keeps houses that sold for more than $15,000</span><span class="w">
  </span><span class="n">group_by</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Groups by neighborhood</span><span class="w">
  </span><span class="n">filter</span><span class="p">(</span><span class="n">n</span><span class="p">()</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">50</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Keeps neighborhoods with more than 50 home sales</span><span class="w">
  </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="c1"># Ungroups the data</span><span class="w">
</span></code></pre></div></div> <h1 id="create-a-parallel-coordinate-chart">Create a parallel coordinate chart</h1> <ul> <li>Try mapping color to neighborhood and faceting by neighborhood. Which is most effective and why?</li> </ul> <p>Both plots, one with color mapping and one with faceting, are shown below. The faceted plot is easier to interpret because it avoids the crowding caused by plotting a large number of neighborhoods at once.</p> <ul> <li>What does the parallel coordinate plot reveal about the neighborhoods?</li> </ul> <p>The plot reveals which variables vary most across the neighborhoods. It highlights the variables that account for most of the variance and could possibly be used as predictors.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Convert to long format with all the numeric variables in one column (i.e., pivot longer)</span><span class="w">
</span><span class="c1"># Hint: specify the "cols" using the selection function from above: where(is.numeric)</span><span class="w">
</span><span class="n">long.house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">pivot_longer</span><span class="p">(</span><span class="n">cols</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">where</span><span class="p">(</span><span class="n">is.numeric</span><span class="p">),</span><span class="w"> </span><span class="n">names_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"variable"</span><span class="p">,</span><span class="w"> </span><span class="n">values_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"value"</span><span class="p">)</span><span class="w">

</span><span class="c1">## Scale values</span><span class="w">
</span><span class="c1"># Hint: Be sure to group by variable before scaling and to ungroup after</span><span class="w">
</span><span class="n">long.house.df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">long.house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">group_by</span><span class="p">(</span><span class="n">variable</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">mutate</span><span class="p">(</span><span class="n">value_sc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">scale</span><span class="p">(</span><span class="n">value</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">ungroup</span><span class="p">()</span><span class="w">
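</span><span class="c1"># Sketch (not part of the original exercise): scale() subtracts the mean and divides by the</span><span class="w">
</span><span class="c1"># standard deviation, so it agrees with a z-score computed by hand. `x` is a made-up vector.</span><span class="w">
</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="m">4</span><span class="p">,</span><span class="w"> </span><span class="m">6</span><span class="p">)</span><span class="w">
</span><span class="n">all.equal</span><span class="p">(</span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">scale</span><span class="p">(</span><span class="n">x</span><span class="p">)),</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">x</span><span class="p">))</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">sd</span><span class="p">(</span><span class="n">x</span><span class="p">))</span><span class="w">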

</span><span class="c1">## Create a variable to define what neighborhoods to plot</span><span class="w">
</span><span class="n">neighborhood_to_plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"StoneBr"</span><span class="p">,</span><span class="w"> </span><span class="s2">"NridgHt"</span><span class="p">,</span><span class="w"> </span><span class="s2">"NoRidge"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Somerst"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Timber"</span><span class="p">)</span><span class="w">

</span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">long.house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">mutate</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ifelse</span><span class="p">(</span><span class="n">neighborhood</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">neighborhood_to_plot</span><span class="p">,</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">,</span><span class="w"> </span><span class="s2">"no"</span><span class="p">))</span><span class="w">

</span><span class="c1">## Create a parallel coordinate plot of the scaled values for selected neighborhoods </span><span class="w">
</span><span class="c1"># Hint: Use the following to specify the subset of data to plot</span><span class="w">
</span><span class="c1">#  ggplot(data = selected_house_df %&gt;%  filter(highlight == "yes"),</span><span class="w">
</span><span class="c1"># Hint: Consider the following to highlight the zero crossing and sales price</span><span class="w">
</span><span class="c1">#  geom_vline(xintercept = "sale_price", colour = "grey98", linewidth = 3) +</span><span class="w">
</span><span class="c1">#  geom_hline(yintercept = 0, colour = "grey99", linewidth = 4) +</span><span class="w">
</span></code></pre></div></div> <h2 id="mapping-color-to-neighborhood">Mapping color to neighborhood</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">  </span><span class="n">filter</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">),</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">variable</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value_sc</span><span class="p">,</span><span class="w"> </span><span class="n">group</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pid</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">neighborhood</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_vline</span><span class="p">(</span><span class="n">xintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sale_price"</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">linewidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_hline</span><span class="p">(</span><span class="n">yintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">linewidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Parallel coordinate plot of house features by neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">coord_flip</span><span class="p">()</span><span class="w"> 
</span></code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-4-1.png" alt=""/></p> <h2 id="faceting-by-neighborhood">Faceting by neighborhood</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selected_house_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">  </span><span class="n">filter</span><span class="p">(</span><span class="n">highlight</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s2">"yes"</span><span class="p">),</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">variable</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">value_sc</span><span class="p">,</span><span class="w"> </span><span class="n">group</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">pid</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_vline</span><span class="p">(</span><span class="n">xintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"sale_price"</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">linewidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_hline</span><span class="p">(</span><span class="n">yintercept</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">colour</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"red"</span><span class="p">,</span><span class="w"> </span><span class="n">linewidth</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.3</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Parallel coordinate plot of house features by neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">facet_wrap</span><span class="p">(</span><span class="o">~</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">scales</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"free_y"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> 
  </span><span class="n">coord_flip</span><span class="p">()</span><span class="w">
</span></code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-5-1.png" alt=""/></p> <h1 id="fit-models-to-all-the-neighborhoods-and-plot-the-estimated-intercept-and-slope">Fit models to all the neighborhoods and plot the estimated intercept and slope</h1> <ul> <li>Fit models to predict sale_price as a function of size: sale_price~x1st_flr_sf</li> <li>Use glance and tidy to extract both overall model fit data and model parameters</li> <li>Hint: After using nest-map-unnest use pivot wider to move the intercept and slope into separate columns</li> <li>Hint: After unnesting, use clean names to turn “(Intercept)” into an acceptable R name</li> <li>Hint: Use the scale to show axis labels as dollars</li> <li> <p>Map the size of the point to the r.squared value. The R-squared value indicates how well the model predicts the data</p> </li> <li>This plot uses a linear model to abstract and aggregate data for each neighborhood. Why might this be useful, and why might it be worse than useless?</li> </ul> <p>A linear model is useful for understanding the relationship between the dependent variable (DV) and the independent variable (IV), and it can help predict the DV. Each neighborhood has different characteristics, so an individual model per neighborhood might fit better. However, this can be worse than useless if the underlying relationship in the data is not linear.</p> <ul> <li>If this regression model were guiding admission decisions for university students based on GPA rather than predicting house prices based on their size, how might the r-square value indicate potential unfairness if the points represent different socio-economic groups rather than neighborhoods?</li> </ul> <p>Using a regression model on GPA to guide admission decisions can be very unfair, because it captures only the linear relationship and any student who deviates from the linear fit would be at a disadvantage. 
Further, relying on just one variable (GPA) can be noisy.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Fit models for each neighborhood</span><span class="w">
</span><span class="n">models</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">house.df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="n">ungroup</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">group_by</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">nest</span><span class="p">()</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
        </span><span class="n">mutate</span><span class="p">(</span><span class="w">
          </span><span class="n">fit</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">data</span><span class="p">,</span><span class="w"> </span><span class="o">~</span><span class="n">lm</span><span class="p">(</span><span class="n">sale_price</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">x1st_flr_sf</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">.x</span><span class="p">)),</span><span class="w">
          </span><span class="n">glanced</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">glance</span><span class="p">),</span><span class="w">
          </span><span class="n">tidied</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">tidy</span><span class="p">),</span><span class="w">
          </span><span class="n">augmented</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">map</span><span class="p">(</span><span class="n">fit</span><span class="p">,</span><span class="w"> </span><span class="n">broom</span><span class="o">::</span><span class="n">augment</span><span class="p">)</span><span class="w">
        </span><span class="p">)</span><span class="w"> 
</span><span class="n">models</span><span class="w">
</span></code></pre></div></div> <div data-pagedtable="false"> <script data-pagedtable-source="" type="application/json">
{"columns":[{"label":["neighborhood"],"name":[1],"type":["chr"],"align":["left"]},{"label":["data"],"name":[2],"type":["list"],"align":["right"]},{"label":["fit"],"name":[3],"type":["list"],"align":["right"]},{"label":["glanced"],"name":[4],"type":["list"],"align":["right"]},{"label":["tidied"],"name":[5],"type":["list"],"align":["right"]},{"label":["augmented"],"name":[6],"type":["list"],"align":["right"]}],"data":[{"1":"NAmes","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Gilbert","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"StoneBr","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NWAmes","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Somerst","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NridgHt","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"NoRidge","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"SawyerW","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Sawyer","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"BrkSide","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"OldTown","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"IDOTRR","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Edwards","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"CollgCr","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Crawfor","2":"<tibble[,31]>","3":"<S3: 
lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Mitchel","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"},{"1":"Timber","2":"<tibble[,31]>","3":"<S3: lm>","4":"<tibble[,12]>","5":"<tibble[,5]>","6":"<tibble[,8]>"}],"options":{"columns":{"min":{},"max":[10]},"rows":{"min":[10],"max":[10]},"pages":{}}}
  </script> </div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">## Unnest the model parameters </span><span class="w">
</span><span class="n">tidy_model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">models</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">unnest</span><span class="p">(</span><span class="n">tidied</span><span class="p">)</span><span class="w">


</span><span class="c1">## Unnest the model fit</span><span class="w">
</span><span class="n">glance_model</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">models</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">unnest</span><span class="p">(</span><span class="n">glanced</span><span class="p">)</span><span class="w">


</span><span class="c1">## Use left_join to combine the parameter and fit dataframes</span><span class="w">
</span><span class="n">model_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">left_join</span><span class="p">(</span><span class="n">tidy_model</span><span class="p">,</span><span class="w"> </span><span class="n">glance_model</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"neighborhood"</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">select</span><span class="p">(</span><span class="n">neighborhood</span><span class="p">,</span><span class="w"> </span><span class="n">r.squared</span><span class="p">,</span><span class="w"> </span><span class="n">estimate</span><span class="p">,</span><span class="w"> </span><span class="n">std.error</span><span class="p">,</span><span class="w"> </span><span class="n">p.value.x</span><span class="p">)</span><span class="w">


</span><span class="c1">## Define highlighted neighborhoods </span><span class="w">
</span><span class="c1"># Hint: Adapt the code from the previous section</span><span class="w">
</span><span class="n">selected_model_df</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">model_df</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> 
  </span><span class="n">filter</span><span class="p">(</span><span class="n">neighborhood</span><span class="w"> </span><span class="o">%in%</span><span class="w"> </span><span class="n">neighborhood_to_plot</span><span class="p">)</span><span class="w">


</span><span class="c1">## Plot the slope and intercept in a scatter plot with the size of the point mapped to the r-square</span><span class="w">
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">selected_model_df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">estimate</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">std.error</span><span class="p">,</span><span class="w"> </span><span class="n">size</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">r.squared</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_point</span><span class="p">(</span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.5</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w">  </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Intercept"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Slope"</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Slope and intercept of linear model for each neighborhood"</span><span class="p">)</span><span class="w"> 
</span></code></pre></div></div> <p><img src="https://nimrobotics.github.io/assets/blog/ex7_files/figure-html/unnamed-chunk-6-1.png" alt=""/></p>]]></content><author><name></name></author><summary type="html"><![CDATA[In this exercise you will use a large dataset that describes houses that were...]]></summary></entry><entry><title type="html">Some tools generated using LLMs (not slop)</title><link href="https://nimrobotics.github.io/blog/2025/tools-gen-with-llms/" rel="alternate" type="text/html" title="Some tools generated using LLMs (not slop)"/><published>2025-09-30T00:00:00+00:00</published><updated>2025-09-30T00:00:00+00:00</updated><id>https://nimrobotics.github.io/blog/2025/tools-gen-with-llms</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2025/tools-gen-with-llms/"><![CDATA[<p>I, like many others, like to think of large language models (LLMs) as “slop”—a messy, unstructured mix of data that may or may not produce coherent and useful outputs. I will write a detailed post on this later, but in short LLMs are just fancy token predictors, and can be really good at menial (not requiring thinking/creativity), common (there should be lots of examples on the internet), short one-time tasks (e.g., summarization, translation, etc.).</p> <p>I played with claude and claude code to generate a few simple tools</p> <ul> <li><a href="/qr">QR Code Generator</a> - generates QR codes from text input</li> <li><a href="/image_converter">Image converter</a> - converts images between different formats (e.g., PNG to JPG)</li> <li><a href="/filediff">File difference checker</a> - compares two text files and highlights the differences</li> </ul> <p>All of these tools process the data within the browser, so no data is sent to any server (100% private, no ads). 
I will be updating these tools over time!</p>]]></content><author><name></name></author><category term="fun"/><category term="LLM,"/><category term="AI"/><summary type="html"><![CDATA[I, like many others, like to think of large language models (LLMs) as “slop”—a messy, unstructured mix of data that may or may not produce coherent and useful outputs. I will write a detailed post on this later, but in short LLMs are just fancy token predictors, and can be really good at menial (not requiring thinking/creativity), common (there should be lots of examples on the internet), short one-time tasks (e.g., summarization, translation, etc.).]]></summary></entry><entry><title type="html">Preprocessing functional Near Infrared Spectroscopy (fNIRS) Data</title><link href="https://nimrobotics.github.io/blog/2025/fnirs-preprocessing/" rel="alternate" type="text/html" title="Preprocessing functional Near Infrared Spectroscopy (fNIRS) Data"/><published>2025-06-28T00:00:00+00:00</published><updated>2025-06-28T00:00:00+00:00</updated><id>https://nimrobotics.github.io/blog/2025/fnirs-preprocessing</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2025/fnirs-preprocessing/"><![CDATA[<p><strong>Contents</strong></p> <ul id="markdown-toc"> <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li> <li><a href="#raw-data" id="markdown-toc-raw-data">Raw Data</a></li> <li><a href="#removing-noise" id="markdown-toc-removing-noise">Removing noise</a> <ul> <li><a href="#motion-artifact-detection-and-correction" id="markdown-toc-motion-artifact-detection-and-correction">Motion Artifact Detection and Correction</a></li> <li><a href="#removing-physiological-noise" id="markdown-toc-removing-physiological-noise">Removing physiological noise</a></li> </ul> </li> <li><a href="#convert-optical-density-to-concentration" id="markdown-toc-convert-optical-density-to-concentration">Convert Optical Density to Concentration</a> <ul> <li><a href="#beer-lambert-law" 
id="markdown-toc-beer-lambert-law">Beer-Lambert Law</a></li> <li><a href="#modified-beer-lambert-law" id="markdown-toc-modified-beer-lambert-law">Modified Beer-Lambert Law</a></li> </ul> </li> <li><a href="#feature-extraction-and-averaging" id="markdown-toc-feature-extraction-and-averaging">Feature Extraction and Averaging</a></li> </ul> <h2 id="introduction">Introduction</h2> <p>Functional Near Infrared Spectroscopy (fNIRS) is a non-invasive brain imaging technique that measures changes in the concentration of oxyhemoglobin and deoxyhemoglobin in the brain. It works by shining near-infrared light through the skull and into the brain tissue. This light is absorbed differently by oxygenated and deoxygenated hemoglobin, allowing researchers to infer changes in blood flow and oxygenation in specific brain regions.</p> <p>fNIRS is particularly useful for studying brain activity in naturalistic settings, as it is portable and can be used with participants who are moving or engaged in various activities. However, like any other neuroimaging technique, fNIRS data requires careful preprocessing to ensure that the signals are clean and interpretable. Broadly speaking, fNIRS has better spatial resolution than EEG but lower temporal resolution. I recommend reading <a class="citation" href="#mehta2013neuroergonomics">(Mehta &amp; Parasuraman, 2013)</a> for a more detailed comparison of fNIRS with other neuroimaging techniques like EEG, fMRI, and MEG.</p> <pre><code class="language-mermaid">flowchart TD
    A[Raw Light Intensity Signal] --&gt; C[Convert to Optical Density]
    
    C --&gt; E[Motion Artifact Detection]
    E --&gt; F[Motion Correction]
    F --&gt; G[Bandpass Filter]
    
    G --&gt; G1[Heartbeat&lt;br/&gt;~1-2 Hz]
    G --&gt; G2[Respiration&lt;br/&gt;~0.4 Hz]
    G --&gt; G3[Mayer Waves&lt;br/&gt;Blood Pressure&lt;br/&gt;~0.1 Hz]
    
    G1 --&gt; H[Filtered Signal]
    G2 --&gt; H
    G3 --&gt; H
    
    H --&gt; I[Convert Optical Density&lt;br/&gt;to Concentration]
    I --&gt; J[Feature Extraction &amp; Averaging]
    
    J --&gt; K1[HbO max]
    J --&gt; K2[HbO mean]
    J --&gt; K3[Connectivity Measures]
    J --&gt; K4[Other Features]
    
    K1 --&gt; L[Statistical Analysis]
    K2 --&gt; L
    K3 --&gt; L
    K4 --&gt; L

</code></pre> <p><strong>Note:</strong></p> <ul> <li>fNIRS and EEG are more portable and suitable for naturalistic or pediatric studies.</li> <li>fMRI and MEG offer higher spatial resolution but are less portable and more expensive.</li> </ul> <h2 id="raw-data">Raw Data</h2> <p>The raw data from fNIRS consists of light intensity signals recorded from the scalp. These signals are typically collected from multiple channels, each corresponding to a specific location on the head.</p> <p>Typically, the recording is done with two different wavelengths of light (usually around 690 nm and 830 nm), which allows for the measurement of the signals from oxyhemoglobin and deoxyhemoglobin.</p> <p>The light intensity is converted to optical density (OD) using the following formula:</p> \[OD = -\log_{10}\left(\frac{I_1}{I_0}\right) = \log_{10}\left(\frac{I_0}{I_1}\right)\] <p>where \(I_1\) is the intensity of the light received by the detector and \(I_0\) is the intensity of the light emitted by the source.</p> <h2 id="removing-noise">Removing noise</h2> <p>The raw fNIRS data is often contaminated with noise from various sources, including motion artifacts, physiological signals (like heartbeat and respiration), and environmental noise. To clean the data, several preprocessing steps are typically applied.</p> <h3 id="motion-artifact-detection-and-correction">Motion Artifact Detection and Correction</h3> <p>Motion artifacts are one of the most significant sources of noise in fNIRS data. They can arise from head movements, muscle contractions, or other physical activities during the recording. To detect and correct for motion artifacts, several methods can be employed.</p> <p>Some motion correction methods require explicitly identifying the motion artifacts, while others use statistical methods to estimate and remove them. 
In Homer3, a popular fNIRS data analysis package, motion artifact detection can be done using the <code class="language-plaintext highlighter-rouge">hmrMotionArtifactByChannel</code> function. This function uses a threshold-based approach to identify motion artifacts in the data. Typical parameters for this function include: tMotion=0.5, tMask=2, STDEVthresh=20, AMPthresh=0.5. Another option is to use the <code class="language-plaintext highlighter-rouge">hmrMotionArtifact</code> function.</p> <p>Detection is followed by motion correction. Homer3 provides several options, including:</p> <ul> <li><code class="language-plaintext highlighter-rouge">hmrR_MotionCorrectCbsi</code>: This function implements the correlation-based signal improvement (CBSI) motion correction algorithm. Signature: <code class="language-plaintext highlighter-rouge">data_dc = hmrR_MotionCorrectCbsi(data_dc, mlActAuto, turnon)</code></li> <li><code class="language-plaintext highlighter-rouge">hmrR_MotionCorrectPCA</code>: This function implements a motion correction algorithm using Principal Component Analysis (PCA). Signature: <code class="language-plaintext highlighter-rouge">[data_d, svs, nSV] = hmrR_MotionCorrectPCA(data_d, mlActMan, mlActAuto, tIncMan, tIncAuto, nSV)</code></li> </ul> <p>There are several other motion correction algorithms available in Homer3, such as <code class="language-plaintext highlighter-rouge">hmrR_MotionCorrectWavelet</code> and <code class="language-plaintext highlighter-rouge">hmrR_MotionCorrectSpline</code>. Each of these algorithms has its own strengths and weaknesses, and the choice of algorithm depends on the specific characteristics of the data and the research question being addressed. I highly recommend reading the associated papers for each algorithm for a deeper understanding of their performance. I also recommend trying several of these algorithms to see which one works best for your data. 
<strong>Always visualize</strong> the results of the motion correction to ensure that the artifacts have been effectively removed.</p> <p>Many of these algorithms require the user to manually select the motion artifacts, which can be time-consuming and subjective. However, there are automated methods that can help with this process. One such method is Temporal Derivative Distribution Repair (TDDR) <a class="citation" href="#fishburn2019temporal">(Fishburn et al., 2019)</a>, which is a data-driven approach that uses the temporal derivative of the signal to identify and correct motion artifacts. Currently, TDDR is not implemented in Homer3, but MATLAB and Python implementations are available at <a href="https://github.com/frankfishburn/TDDR">https://github.com/frankfishburn/TDDR</a>. It is also implemented in the <a href="https://mne.tools/stable/generated/mne.preprocessing.nirs.temporal_derivative_distribution_repair.html">MNE-Python</a> package, which is a popular Python package for neuroimaging data analysis.</p> <h3 id="removing-physiological-noise">Removing physiological noise</h3> <p>Any signal can be decomposed into its frequency components using the Fourier transform. The Power Spectral Density (PSD) can be used to analyze the frequency content of the signal. The PSD can be estimated using various methods, such as Welch’s method (<code class="language-plaintext highlighter-rouge">scipy.signal.welch</code> in Python). 
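</p> <p>As a rough sketch of such a PSD estimate, the snippet below applies Welch’s method to a synthetic signal. The 10 Hz sampling rate, the 0.05 Hz and 1.2 Hz components, and the segment length are illustrative assumptions, not values from a real recording.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
from scipy.signal import welch

# Assumed sampling rate in Hz (hypothetical; use your device's actual rate).
fs = 10.0
t = np.arange(0, 300, 1 / fs)  # about 300 s of samples

# Synthetic stand-in for one fNIRS channel (not real data):
# a slow 0.05 Hz "hemodynamic" component plus a weaker 1.2 Hz "heartbeat".
signal = np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 1.2 * t)

# Welch's method averages periodograms over overlapping segments.
# nperseg sets the segment length, so the frequency resolution is fs / nperseg.
freqs, psd = welch(signal, fs=fs, nperseg=1024)

# Frequency bin that carries the most power in this synthetic example.
peak_freq = freqs[np.argmax(psd)]
print(f"Strongest component near {peak_freq:.3f} Hz")
</code></pre></div></div> <p>On real data you would pass one channel’s optical density time series and the device’s sampling rate instead of the synthetic signal.</p> <p>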
Sources of physiological noise include heartbeat, respiration, and Mayer waves (which are related to blood pressure fluctuations).</p> <p>Frequency breakdown of fNIRS signals:</p> <ul> <li>Heartbeat: ~1-2 Hz</li> <li>Respiration: ~0.4 Hz</li> <li>Mayer waves (blood pressure): ~0.1 Hz</li> <li>Useful fNIRS signals: typically below 0.1 Hz</li> </ul> <p align="center"> <img src="/assets/img/blog/fnirsp/psd.png" alt="Power Spectral Density Diagram" width="80%"/> <br/> <em>Figure 1: Example Power Spectral Density (PSD) plot of fNIRS data (x-axis is frequency in Hz), illustrating the frequency components corresponding to heartbeat, respiration, Mayer waves, and the low-frequency band of interest.</em> </p> <p><strong>Bandpass Filtering</strong>: A bandpass filter can be applied to isolate the frequency range of interest. Typically, a passband of 0.01 Hz to 0.5 Hz is used: the low cutoff removes slow drift and the high cutoff removes the heartbeat component.</p> <p align="center"> <img src="/assets/img/blog/fnirsp/psdfilt.png" alt="Filtered Power Spectral Density Diagram" width="80%"/> <br/> <em>Figure 2: Filtered Power Spectral Density (PSD) plot of fNIRS data (x-axis is frequency in Hz).</em> </p> <h2 id="convert-optical-density-to-concentration">Convert Optical Density to Concentration</h2> <p>Once the optical density signals are cleaned, they can be converted to concentration changes of oxyhemoglobin (HbO) and deoxyhemoglobin (HbR) using the modified Beer-Lambert law. This law relates the changes in optical density to the concentration changes of hemoglobin in the brain.</p> <h3 id="beer-lambert-law">Beer-Lambert Law</h3> <p align="center"> <img src="/assets/img/blog/fnirsp/bll.png" alt="Beer-Lambert Law Diagram" width="60%"/> <br/> <em>Figure 3: Beer-Lambert Law Diagram, illustrating the relationship between light intensity, optical density, and concentration of absorbing species.</em> </p> <p>The Beer-Lambert law describes the relationship between the intensity of light absorbed by a substance and its concentration. 
It is expressed as:</p> \[OD(\lambda) = \log\left(\frac{I_{0\lambda}}{I_{1\lambda}}\right) = \alpha(\lambda) \cdot c \cdot l\] <p>where:</p> <ul> <li>\(OD(\lambda)\) is the optical density at wavelength \(\lambda\)</li> <li>\(\alpha(\lambda)\) is the molar extinction coefficient (also called absorption coefficient or specific absorption coefficient) at wavelength \(\lambda\). It quantifies how strongly a particular substance (e.g., HbO\(_2\) or HHb) absorbs light at that wavelength.</li> <li>\(c\) is the concentration of the absorbing species (HbO or HbR)</li> <li>\(l\) is the path length of the light through the tissue</li> </ul> <h3 id="modified-beer-lambert-law">Modified Beer-Lambert Law</h3> <p>Why modify? The original Beer-Lambert law assumes a clear medium with no scattering, which is not the case in biological tissues. In tissues, light scattering significantly affects the path length of light, so we need to account for this. The modified Beer-Lambert law incorporates a correction factor for scattering, known as the differential path length factor (DPF), and a term for scattering effects, denoted as \(S(\lambda)\) <a class="citation" href="#baker2014modified">(Baker et al., 2014)</a>. 
The modified Beer-Lambert law is expressed as:</p> \[OD(\lambda) = \log\left(\frac{I_{0\lambda}}{I_{1\lambda}}\right) = \alpha(\lambda) \cdot c \cdot l \cdot DPF + S(\lambda)\] <p>where:</p> <ul> <li>\(OD(\lambda)\) is the optical density at wavelength \(\lambda\)</li> <li>\(\alpha(\lambda)\) is the molar extinction coefficient (also called absorption coefficient or specific absorption coefficient) at wavelength \(\lambda\)</li> <li>\(c\) is the concentration of the absorbing species (e.g., HbO\(_2\) or HHb)</li> <li>\(l\) is the path length of the light through the tissue</li> <li>\(DPF\) is the differential path length factor, which accounts for the scattering of light in biological tissues</li> <li>\(S(\lambda)\) is a term that accounts for scattering effects and other factors that may affect the light absorption in biological tissues.</li> </ul> <p>For continuous wave (CW) fNIRS, the above equation is typically expressed in terms of changes in optical density, i.e., the difference between two consecutive time points (\(t_n\) and \(t_{n-1}\)):</p> \[\Delta OD(\lambda) = \alpha(\lambda) \cdot \Delta c \cdot l \cdot DPF\] <p>Notice the \(S(\lambda)\) term gets cancelled out when calculating the change in optical density.</p> <p><strong>Partial Volume Factor (PVF)</strong>: PVF is applied as another correction factor to adjust for the fraction of the path that actually goes through the brain tissue (i.e. 
excluding scalp, skull, and other tissues).</p> <p>The equation above now becomes:</p> \[\Delta OD(\lambda) = \alpha(\lambda) \cdot \Delta c \cdot l \cdot DPF \cdot PVF\] <p>These correction factors (DPF and PVF) are combined into a single term called the partial pathlength factor (PPF) <a class="citation" href="#whiteman2018investigation">(Whiteman et al., 2018)</a>, which is often used in fNIRS studies:</p> \[\Delta OD(\lambda) = \alpha(\lambda) \cdot \Delta c \cdot l \cdot PPF\] <p>where: \(PPF = DPF \cdot PVF\)</p> <p>Note: the extinction coefficient and the correction factors are all functions of wavelength \(\lambda\), so they are written as \(\alpha(\lambda)\), \(DPF(\lambda)\), and \(PVF(\lambda)\).</p> <blockquote> <p>“Typical value is ~6 for each wavelength if the absorption change is uniform over the volume of tissue measured. To approximate the partial volume effect of a small localized absorption change within an adult human head, this value could be as small as 0.1. Convention is becoming to set ppf=1 and to not divide by the source-detector separation such that the resultant “concentration” is in units of Molar mm (or Molar cm if those are the spatial units). This is becoming wide spread in the literature but there is no fixed citation. 
Use a value of 1 to choose this option.” - <a href="https://github.com/BUNPC/Homer3/blob/master/FuncRegistry/UserFunctions/hmrR_OD2Conc.m">hmrR_OD2Conc</a> docstring.</p> </blockquote> <blockquote> <p>The MNE project also recommends (see <a href="https://mne.discourse.group/t/why-ppf-and-not-dpf-in-mne-preprocessing-nirs-beer-lambert-law/4373">this</a> and <a href="https://github.com/mne-tools/mne-python/pull/9843">this</a>) using a PPF of 6, and it sets PPF to 6 by <a href="https://github.com/mne-tools/mne-python/blob/maint/1.9/mne/preprocessing/nirs/_beer_lambert_law.py#L88">default</a> in current versions of MNE-Python.</p> </blockquote> <p>Caveat: The path length correction factors depend on the 3D geometry of the head/brain and can vary with age, sex, and individual anatomy. This introduces a few challenges <a class="citation" href="#whiteman2018investigation">(Whiteman et al., 2018)</a>.</p> <ul> <li>The values vary with region of the brain for the same individual. This means that comparing results across different brain regions or individuals may be problematic.</li> <li><strong>Sex differences</strong>: The path length correction factors can differ between males and females, and can be a confounding factor in studies comparing sexes.</li> </ul> <p>Addressing these issues is out of scope for this post, but I recommend reading <a class="citation" href="#whiteman2018investigation">(Whiteman et al., 2018)</a> for a detailed discussion on this topic.</p> <p>We can now express the change in optical density as a function of the changes in concentration of oxyhemoglobin and deoxyhemoglobin, which is the main goal of fNIRS data preprocessing. 
The measured change in optical density is the sum of the contributions from the absorbing constituents of the blood, specifically oxyhemoglobin (HbO\(_2\)) and deoxyhemoglobin (HHb).</p> \[\Delta OD(\lambda) = \sum_{n} \alpha_n(\lambda) \cdot \Delta c_n \cdot l \cdot PPF\] <p>where \(n\) is the index of the absorbing species (e.g., oxyhemoglobin and deoxyhemoglobin).</p> \[\Delta OD(\lambda) = \alpha_{\text{HbO}_2}(\lambda) \cdot \Delta c_{\text{HbO}_2} \cdot l \cdot PPF + \alpha_{\text{HHb}}(\lambda) \cdot \Delta c_{\text{HHb}} \cdot l \cdot PPF\] <p>So far we have one equation (above) and two unknowns (\(\Delta c_{\text{HbO}_2}\) and \(\Delta c_{\text{HHb}}\)). To solve for these unknowns, we use the fact that fNIRS typically measures the changes in optical density at two different wavelengths (\(\lambda_1\) and \(\lambda_2\)), which gives one equation per wavelength:</p> \[\Delta OD(\lambda_1) = \alpha_{\text{HbO}_2}(\lambda_1) \cdot \Delta c_{\text{HbO}_2} \cdot l \cdot PPF + \alpha_{\text{HHb}}(\lambda_1) \cdot \Delta c_{\text{HHb}} \cdot l \cdot PPF\] \[\Delta OD(\lambda_2) = \alpha_{\text{HbO}_2}(\lambda_2) \cdot \Delta c_{\text{HbO}_2} \cdot l \cdot PPF + \alpha_{\text{HHb}}(\lambda_2) \cdot \Delta c_{\text{HHb}} \cdot l \cdot PPF\] <p>The above pair of equations can be solved for the changes in concentration of oxyhemoglobin (\(\Delta c_{\text{HbO}_2}\)) and deoxyhemoglobin (\(\Delta c_{\text{HHb}}\)), since all other variables are known or can be measured.</p> <h2 id="feature-extraction-and-averaging">Feature Extraction and Averaging</h2> <p>After converting the optical density signals to concentration changes, the next step is to extract features from the data. 
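</p> <p>The conversion and two simple features can be sketched in a few lines of Python. This is a minimal illustration, not the Homer3 or MNE-Python implementation: the function name <code class="language-plaintext highlighter-rouge">od_to_concentration</code>, the extinction coefficients, the source-detector distance, and the PPF are all placeholder assumptions chosen so that the example is easy to check. It solves the two-wavelength system from the previous section with Cramer’s rule and then computes HbO max and HbO mean.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def od_to_concentration(d_od_1, d_od_2, ext_1, ext_2, distance, ppf):
    """Solve the two-wavelength system for (delta_HbO2, delta_HHb).

    d_od_1, d_od_2 : change in optical density at wavelength 1 and 2
    ext_1, ext_2   : (alpha_HbO2, alpha_HHb) at wavelength 1 and 2
    distance       : source-detector separation l
    ppf            : partial pathlength factor (one value for both
                     wavelengths, which is a simplifying assumption)
    """
    scale = distance * ppf
    a11, a12 = ext_1[0] * scale, ext_1[1] * scale
    a21, a22 = ext_2[0] * scale, ext_2[1] * scale
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("extinction coefficients give a singular system")
    # Cramer's rule for the 2x2 linear system.
    d_hbo = (d_od_1 * a22 - a12 * d_od_2) / det
    d_hhb = (a11 * d_od_2 - a21 * d_od_1) / det
    return d_hbo, d_hhb


# Placeholder numbers, NOT real extinction coefficients or device settings.
EXT_1 = (0.5, 2.0)  # (alpha_HbO2, alpha_HHb) at wavelength 1
EXT_2 = (1.2, 0.8)  # (alpha_HbO2, alpha_HHb) at wavelength 2
DISTANCE, PPF = 3.0, 6.0

# Made-up concentration changes, used only to build a checkable example.
true_hbo = [0.0, 1.0, 2.0, 1.0]
true_hhb = [0.0, -0.3, -0.5, -0.2]

# Forward model: optical density changes these concentrations would produce.
od_1 = [(EXT_1[0] * h + EXT_1[1] * r) * DISTANCE * PPF
        for h, r in zip(true_hbo, true_hhb)]
od_2 = [(EXT_2[0] * h + EXT_2[1] * r) * DISTANCE * PPF
        for h, r in zip(true_hbo, true_hhb)]

# Inverse step: recover the concentration changes sample by sample.
conc = [od_to_concentration(a, b, EXT_1, EXT_2, DISTANCE, PPF)
        for a, b in zip(od_1, od_2)]
hbo = [c[0] for c in conc]

# Two simple features over the whole window.
hbo_max = max(hbo)
hbo_mean = sum(hbo) / len(hbo)
print(f"HbO max: {hbo_max:.3f}, HbO mean: {hbo_mean:.3f}")
</code></pre></div></div> <p>In practice the extinction coefficients come from published tables for the wavelengths of your device, and the features are usually computed per channel and per task window rather than over the whole recording.</p> <p>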
Common features include:</p> <ul> <li>Peak activation of oxyhemoglobin (HbO max)</li> <li>Mean concentration change of oxyhemoglobin (HbO mean)</li> <li>Functional connectivity measures</li> <li>Effective connectivity measures</li> <li>Graph-based measures</li> </ul> <p>I recommend reading <a class="citation" href="#yucel2021best">(Yücel et al., 2021)</a> for a detailed overview of the best practices for fNIRS data preprocessing.</p>]]></content><author><name></name></author><category term="research"/><category term="fnirs,"/><category term="neuroimaging,"/><category term="preprocessing"/><summary type="html"><![CDATA[Contents Introduction Introduction]]></summary></entry><entry><title type="html">Recurrence Quantification Analysis (RQA), CRQA and physiological data</title><link href="https://nimrobotics.github.io/blog/2025/rqa-crqa-ecg/" rel="alternate" type="text/html" title="Recurrence Quantification Analysis (RQA), CRQA and physiological data"/><published>2025-01-01T00:00:00+00:00</published><updated>2025-01-01T00:00:00+00:00</updated><id>https://nimrobotics.github.io/blog/2025/rqa-crqa-ecg</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2025/rqa-crqa-ecg/"><![CDATA[<p><strong>Contents</strong></p> <ul id="markdown-toc"> <li><a href="#why-non-linear-methods" id="markdown-toc-why-non-linear-methods">Why non-linear methods?</a></li> <li><a href="#recurrence-quantification-analysis-rqa" id="markdown-toc-recurrence-quantification-analysis-rqa">Recurrence Quantification Analysis (RQA)</a> <ul> <li><a href="#performing-rqa-on-sine-wave" id="markdown-toc-performing-rqa-on-sine-wave">Performing RQA on Sine Wave</a></li> </ul> </li> <li><a href="#making-sense-of-the-paramters" id="markdown-toc-making-sense-of-the-paramters">Making sense of the paramters</a> <ul> <li><a href="#embedding-dimension" id="markdown-toc-embedding-dimension">Embedding dimension</a></li> <li><a href="#time-delay" id="markdown-toc-time-delay">Time delay</a></li> <li><a 
href="#radius" id="markdown-toc-radius">Radius</a></li> </ul> </li> <li><a href="#making-sense-of-the-metrics" id="markdown-toc-making-sense-of-the-metrics">Making sense of the metrics</a> <ul> <li><a href="#recurrence-rate" id="markdown-toc-recurrence-rate">Recurrence rate</a></li> <li><a href="#determinism" id="markdown-toc-determinism">Determinism</a></li> <li><a href="#entropy" id="markdown-toc-entropy">Entropy</a></li> </ul> </li> <li><a href="#embedding-dimension-greater-than-1" id="markdown-toc-embedding-dimension-greater-than-1">Embedding dimension greater than 1</a> <ul> <li><a href="#average-mutual-information" id="markdown-toc-average-mutual-information">Average mutual information</a></li> <li><a href="#false-nearest-neighbors" id="markdown-toc-false-nearest-neighbors">False Nearest Neighbors</a></li> <li><a href="#phase-space-reconstruction" id="markdown-toc-phase-space-reconstruction">Phase Space Reconstruction</a></li> <li><a href="#rqa-on-lorenz-attractor" id="markdown-toc-rqa-on-lorenz-attractor">RQA on Lorenz Attractor</a></li> <li><a href="#crqa-on-lorenz-attractor" id="markdown-toc-crqa-on-lorenz-attractor">CRQA on Lorenz Attractor</a></li> </ul> </li> <li><a href="#electrocardiogram-ecg-data-analysis" id="markdown-toc-electrocardiogram-ecg-data-analysis">Electrocardiogram (ECG) Data Analysis</a> <ul> <li><a href="#ami-on-ecg-data" id="markdown-toc-ami-on-ecg-data">AMI on ECG Data</a></li> <li><a href="#fnn-on-ecg-data" id="markdown-toc-fnn-on-ecg-data">FNN on ECG Data</a></li> <li><a href="#rqa-on-ecg-data" id="markdown-toc-rqa-on-ecg-data">RQA on ECG Data</a></li> <li><a href="#crqa-on-ecg-data" id="markdown-toc-crqa-on-ecg-data">CRQA on ECG Data</a></li> </ul> </li> <li><a href="#interbeat-interval-data-analysis" id="markdown-toc-interbeat-interval-data-analysis">Interbeat Interval Data Analysis</a> <ul> <li><a href="#ami-on-ibi-data" id="markdown-toc-ami-on-ibi-data">AMI on IBI Data</a></li> <li><a href="#fnn-on-ibi-data" 
id="markdown-toc-fnn-on-ibi-data">FNN on IBI Data</a></li> <li><a href="#rqa-on-ibi-data" id="markdown-toc-rqa-on-ibi-data">RQA on IBI Data</a></li> <li><a href="#crqa-on-ibi-data" id="markdown-toc-crqa-on-ibi-data">CRQA on IBI Data</a></li> </ul> </li> <li><a href="#multi-dimensional-rqa-mdrqa" id="markdown-toc-multi-dimensional-rqa-mdrqa">Multi-dimensional RQA (MdRQA)</a></li> <li><a href="#multi-dimensional-crqa-mdcrqa" id="markdown-toc-multi-dimensional-crqa-mdcrqa">Multi-dimensional CRQA (MdCRQA)</a></li> <li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li> </ul> <p>In this post, we will discuss the Cross Recurrence Quantification Analysis (CRQA) method. CRQA is a method to quantify the degree of similarity between two time series. It is a generalization of Recurrence Quantification Analysis (RQA) to two time series. We will start by discussing the basic idea behind RQA and then move on to CRQA using illustrative examples. We also discuss the multi-dimensional generalization of RQA and CRQA, which allows us to analyze the similarity between multiple time series.</p> <h2 id="why-non-linear-methods">Why non-linear methods?</h2> <p>Some behavioral interactions are very complex and dynamic in nature, making it hard to capture them using traditional methods. Further, traditional methods make assumptions about data characteristics, such as linearity, normality, and stationarity, that can oversimplify the intricate nature of behavioral phenomena. For example, RQA captures patterns of recurrence and temporal structure in data, offering insights into system behavior that cannot be gleaned from summary statistics alone. Carello and Moreno <a class="citation" href="#carello2005nonlinear">(Carello &amp; Moreno, 2005)</a> argue that non-linear methods can offer a more nuanced understanding of complex systems:</p> <ol> <li>They capture complex temporal structures that summary statistics miss. 
Traditional measures like means and variances often fail to reflect the true dynamical properties of behavioral data, particularly when dealing with non-stationary time series where averages change over time.</li> <li>They acknowledge that behavioral variability isn’t merely noise to be eliminated, but rather contains meaningful structure that reveals underlying system dynamics. This variability often shows self-similar patterns across different time scales, suggesting organized rather than random fluctuations.</li> <li>They allow for a more nuanced understanding of component interactions. Unlike linear methods that assume simple additive relationships, non-linear analyses can detect subtle interdependencies and complex patterns of coordination between system elements.</li> <li>They make fewer a priori assumptions about the nature of the system being studied. Rather than imposing strict constraints on how components may interact, non-linear methods let the data reveal the true complexity of behavioral organization.</li> </ol> <p>These methods are suited to capturing the rich structure and dynamics inherent in behavioral data, offering insights that might be missed by conventional linear approaches.</p> <h2 id="recurrence-quantification-analysis-rqa">Recurrence Quantification Analysis (RQA)</h2> <p>You may ask: what is <strong>recurrence</strong>? Recurrence is a fundamental property of dynamical systems, reflecting the tendency of a system to return to a state it has previously visited. And what is a <strong>state</strong>? A state is a configuration of the system that captures its current condition. For example, in a simple pendulum, the state of the system is defined by the position and velocity of the pendulum. The state of the system evolves over time as the pendulum swings back and forth. Recurrence is the tendency of the pendulum to return to a state it has previously visited, such as when it swings back to the same position and velocity. 
In a cognitive or behavioral system, the recurring states could be patterns of behavior, thoughts, or emotions.</p> <p>Recurrence Quantification Analysis (RQA) is a method to quantify the degree of recurrence in a time series (a single variable, say the velocity magnitude of a pendulum). It is based on the idea that the dynamics of a system can be captured by the recurrence of states in the phase space. RQA quantifies the recurrence of states in the phase space by measuring the frequency and duration of recurrent states. It provides a way to analyze the temporal structure of a time series and extract information about the underlying dynamics of the system.</p> <p>The position or the velocity of a pendulum can be represented as time series data. Let’s generate a simple sine wave to illustrate the concept of RQA.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Generate a sine wave</span><span class="w">
</span><span class="n">set.seed</span><span class="p">(</span><span class="m">123</span><span class="p">)</span><span class="w"> </span><span class="c1"># For reproducibility</span><span class="w">
</span><span class="n">time_points</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.01</span><span class="p">)</span><span class="w">  </span><span class="c1"># Time sequence</span><span class="w">
</span><span class="n">sine_wave</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">sin</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nb">pi</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">time_points</span><span class="o">*</span><span class="m">0.5</span><span class="p">)</span><span class="w">  </span><span class="c1"># Sine wave: sin(2*pi*w*t)</span><span class="w">
</span><span class="n">cos_wave</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">cos</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nb">pi</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">time_points</span><span class="p">)</span><span class="w">  </span><span class="c1"># Cosine wave</span><span class="w">

</span><span class="n">sin_cos_df</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">time_points</span><span class="p">,</span><span class="w"> </span><span class="n">sine_wave</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">,</span><span class="w"> </span><span class="n">cos_wave</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">cos_wave</span><span class="p">)</span><span class="w">

</span><span class="c1"># Plot the sine wave</span><span class="w">
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">sin_cos_df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">time</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="w">
    </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sine Wave"</span><span class="p">,</span><span class="w">
    </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Time"</span><span class="p">,</span><span class="w">
    </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Amplitude"</span><span class="w">
  </span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/sine_wave1-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot time point index vs sine wave</span><span class="w">
</span><span class="c1"># Add an index column to the dataframe</span><span class="w">
</span><span class="n">sin_cos_df</span><span class="o">$</span><span class="n">index</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">seq_along</span><span class="p">(</span><span class="n">sin_cos_df</span><span class="o">$</span><span class="n">time</span><span class="p">)</span><span class="w">

</span><span class="c1"># Plot time point index vs sine wave</span><span class="w">
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">sin_cos_df</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">index</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="w">
    </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sine Wave with Time Point Index"</span><span class="p">,</span><span class="w">
    </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Index"</span><span class="p">,</span><span class="w">
    </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Amplitude"</span><span class="w">
  </span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">scale_x_continuous</span><span class="p">(</span><span class="n">breaks</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="nf">max</span><span class="p">(</span><span class="n">sin_cos_df</span><span class="o">$</span><span class="n">index</span><span class="p">),</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">50</span><span class="p">))</span><span class="w">  </span><span class="c1"># Adjust the 'by' value to control density</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/sine_wave1-2.png" alt=""/></p> <p>The plot above shows a simple sine wave with a frequency of 0.5 Hz (the code computes sin(2*pi*0.5*t)), so one cycle takes 2 seconds, or 200 samples at the 0.01-second step. The sine wave oscillates between -1 and 1 and completes five cycles over the 10-second window. This time series data can be analyzed using RQA to quantify the degree of recurrence in the data.</p> <h3 id="performing-rqa-on-sine-wave">Performing RQA on Sine Wave</h3> <p>Now, we will perform RQA on the sine wave data. We will use the <code class="language-plaintext highlighter-rouge">rqa</code> function from the <code class="language-plaintext highlighter-rouge">nonlinearTseries</code> package in R. We call the <code class="language-plaintext highlighter-rouge">rqa</code> function with the following parameters:</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embed</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="m">1</span><span class="w">  </span><span class="c1"># Embedding dimension</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="m">5</span><span class="w">  </span><span class="c1"># Time delay</span><span class="w">

</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">nonlinearTseries</span><span class="o">::</span><span class="n">rqa</span><span class="p">(</span><span class="n">time.series</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">,</span><span class="w">
                                    </span><span class="n">embedding.dim</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed</span><span class="p">,</span><span class="w">
                                    </span><span class="n">time.lag</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
                                    </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">
                                    </span><span class="n">lmin</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
                                    </span><span class="n">vmin</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w">
                                    </span><span class="n">do.plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/unnamed-chunk-1-1.png" alt=""/></p> <p>Each point in the recurrence plot marks a pair of times at which the system was in similar states. The diagonal lines indicate that the system retraces a similar sequence of states at two different times (in a cross-recurrence plot, that the two time series evolve similarly). The vertical lines represent the duration of recurrent states, indicating how long the system remains in a particular state. Finally, we quantify this plot using measures like recurrence rate, determinism, and entropy. We will talk about these measures in detail in subsequent sections.</p> <p><code class="language-plaintext highlighter-rouge">crqa</code> is another popular R package, used to perform Cross-Recurrence Quantification Analysis (CRQA), which is an extension of RQA to analyze the recurrence between two time series (we will talk about this in a later section). We can compute RQA using the <code class="language-plaintext highlighter-rouge">crqa</code> function from the <code class="language-plaintext highlighter-rouge">crqa</code> package by providing the same time series as input for both the <code class="language-plaintext highlighter-rouge">ts1</code> and <code class="language-plaintext highlighter-rouge">ts2</code> arguments.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embed</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="m">1</span><span class="w">  </span><span class="c1"># Embedding dimension</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="m">5</span><span class="w">  </span><span class="c1"># Time delay</span><span class="w">

</span><span class="c1"># Perform RQA on the sine wave</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sine_wave</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print RQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 18.96315
## 
## $DET
## [1] 99.9779
## 
## $NRLINE
## [1] 9569
## 
## $maxL
## [1] 1001
## 
## $L
## [1] 19.85254
## 
## $ENTR
## [1] 2.901428
## 
## $rENTR
## [1] 0.6416549
## 
## $LAM
## [1] 99.98947
## 
## $TT
## [1] 23.53996
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
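
# --- Illustrative sketch (base R only, not part of the crqa package) ---
# To make RR and DET less of a black box, the recurrence matrix of a 1-D
# series can be built by hand. The names t_demo, x_demo, rp_demo, rr_demo
# and det_demo are made up for this sketch. The values need not match the
# output printed above, because crqa() also normalizes the input.
t_demo &lt;- seq(0, 10, by = 0.01)
x_demo &lt;- sin(2 * pi * 0.5 * t_demo)

# With embed = 1 a state is a single value, so the distance is |x_i - x_j|
rp_demo &lt;- abs(outer(x_demo, x_demo, "-")) &lt;= 0.1  # TRUE = recurrent pair

# Recurrence rate: percentage of recurrent cells in the matrix
rr_demo &lt;- 100 * mean(rp_demo)

# Determinism: share of recurrent points lying on diagonal runs of length
# at least 2 (the main diagonal is counted here for simplicity)
run_lengths &lt;- unlist(lapply(
  split(rp_demo, col(rp_demo) - row(rp_demo)),  # one vector per diagonal
  function(d) {
    r &lt;- rle(d)
    r$lengths[r$values]                          # lengths of TRUE runs
  }
))
det_demo &lt;- 100 * sum(run_lengths[run_lengths &gt;= 2]) / sum(rp_demo)

c(RR = rr_demo, DET = det_demo)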
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/rqa1-1.png" alt=""/></p> <p>Notice that the plot made using <code class="language-plaintext highlighter-rouge">nonlinearTseries::rqa</code> has the y-axis flipped compared to this <code class="language-plaintext highlighter-rouge">crqa::crqa</code> plot. This is because the <code class="language-plaintext highlighter-rouge">rqa</code> function in the <code class="language-plaintext highlighter-rouge">nonlinearTseries</code> package uses a different convention for the orientation of the recurrence plot. If you flip back the y-axis, you will see that the two recurrence plots show the same structure.</p> <h2 id="making-sense-of-the-paramters">Making sense of the parameters</h2> <h3 id="embedding-dimension">Embedding dimension</h3> <p>Embedding dimension (<code class="language-plaintext highlighter-rouge">embed</code>): The number of dimensions of the phase space in which the time series data is embedded. It is used to reconstruct the phase space from the time series data. You may be wondering why we need to embed the time series data in a higher-dimensional space. This is because the dynamics of the system are often governed by multiple variables or factors that interact with each other. Embedding the time series data in a higher-dimensional space allows us to capture these interactions and reconstruct the underlying dynamics of the system.</p> <p><strong>Departure from one-dimensional analysis.</strong> Webber and Zbilut <a class="citation" href="#webber2005recurrence">(Webber Jr &amp; Zbilut, 2005)</a> provide a comprehensive overview of RQA, and I highly recommend reading it. The analysis of a single time series as shown above is inherently one-dimensional, the measured quantity itself being a single variable. 
However, <a class="citation" href="#wallot2018analyzing">(Wallot &amp; Leonardi, 2018)</a> argue that many behavioral phenomena are inherently multi-dimensional. For example (borrowed from <a class="citation" href="#wallot2019multidimensional">(Wallot, 2019)</a>), consider Electrocardiogram (ECG) data, which captures the electrical activity of the heart using two electrodes placed on the chest. This setup essentially captures data in one plane (instead of the three bodily planes: frontal, sagittal, and transverse), and the ECG signal is at its core an aggregation of multiple sources.</p> <p>One question that arises now is: how do we capture the multi-dimensional nature of the data? One way to do this is to use time-delayed embedding, a method to reconstruct the phase space of a dynamical system from a single time series. <strong>Takens theorem</strong> states that, under suitable conditions, the phase space of a system can be reconstructed from a single observed variable by using time-delayed copies of that variable as coordinates, and that this reconstruction preserves the underlying dynamics of the system. Concretely, for an embedding dimension \(m\) and a time delay \(\tau\), each reconstructed state is the vector \((x_t, x_{t+\tau}, \ldots, x_{t+(m-1)\tau})\).</p> <p>We can estimate the embedding dimension using the false nearest neighbors method. False neighbors are points that look close to each other only because the embedding dimension is too small. The method iteratively increases the embedding dimension and calculates the fraction of false nearest neighbors at each step. The optimal embedding dimension is the smallest one at which this fraction drops close to zero, i.e. the smallest dimension that captures the underlying dynamics of the system.</p> <h3 id="time-delay">Time delay</h3> <p>Time delay (<code class="language-plaintext highlighter-rouge">delay</code>): The lag, in samples, between successive coordinates of an embedded point. It is used to reconstruct the phase space from the time series data. The time delay is crucial for capturing the dynamics of the system and is often determined using the autocorrelation function of the time series data. 
The time delay helps in capturing the temporal structure of the system and is essential for reconstructing the phase space from the time series data. Autocorrelation measures the correlation between the time series and lagged copies of itself.</p> <h3 id="radius">Radius</h3> <p>Radius (<code class="language-plaintext highlighter-rouge">radius</code>): The threshold distance for recurrence, i.e. the maximum distance between two phase-space points for them to be considered recurrent. In simple terms, it defines how close two points in the phase space need to be to count as a recurrence.</p> <h2 id="making-sense-of-the-metrics">Making sense of the metrics</h2> <p>Recurrence quantification analysis (RQA) provides several metrics to quantify the degree of recurrence in the data. It is important to understand what they represent and how they can be interpreted.</p> <h3 id="recurrence-rate">Recurrence rate</h3> <p>Recurrence rate (<code class="language-plaintext highlighter-rouge">RR</code>): The percentage of pairs of points in the phase space that lie within the threshold distance of each other. It quantifies the density of recurrence points in a recurrence plot. In other words, it measures how often states in a dynamical system recur over time.</p> <p>In practice, the recurrence rate depends on the threshold distance (radius). A larger radius results in a higher recurrence rate, because more pairs of points fall within the threshold; a smaller radius results in a lower recurrence rate, because points need to be closer to be considered recurrent.</p> <p><strong>What does a high recurrence rate mean in physiological data?</strong> A high recurrence rate in physiological data indicates that the system frequently revisits similar states, so the signal spends much of its time in a limited region of its phase space. 
On its own, the recurrence rate does not say whether those recurrences are organized into sustained patterns; that is what determinism captures.</p> <h3 id="determinism">Determinism</h3> <p>Determinism (<code class="language-plaintext highlighter-rouge">DET</code>): The percentage of recurrent points that form diagonal lines (of at least the minimum length, <code class="language-plaintext highlighter-rouge">mindiagline</code> in the call above) in the recurrence plot. It represents the predictability or regularity of the system. A higher determinism indicates that the system returns to similar states over time and exhibits regular behavior, while a lower determinism indicates that the system explores different states and exhibits irregular behavior.</p> <p>Determinism goes a step further than the recurrence rate: it tells you about the predictability and structure in your system. The diagonal lines are crucial because they indicate that segments of your trajectory run parallel to other segments, i.e. the system revisits the same sequence of states over time rather than isolated states.</p> <p><strong>What does a high determinism mean in physiological data?</strong> A high determinism in physiological data indicates that the system exhibits predictable and structured behavior, with recurring patterns that are sustained over time. This suggests a degree of regularity or coordination in the underlying physiological processes.</p> <h3 id="entropy">Entropy</h3> <p>Entropy (<code class="language-plaintext highlighter-rouge">ENTR</code>): The Shannon entropy of the distribution of diagonal line lengths in the recurrence plot. A higher entropy indicates that diagonal lines of many different lengths occur, so the system explores a wide range of states and exhibits complex behavior, while a lower entropy indicates that the line lengths are concentrated on a few values, so the system returns to similar states and exhibits regular behavior. 
Entropy is a measure of the complexity or randomness of the system.</p> <h2 id="embedding-dimension-greater-than-1">Embedding dimension greater than 1</h2> <p>We will now look at another example - Lorenz attractor - to demonstrate the importance of embedding dimension greater than 1. This is a set of three coupled differential equations that describe the trajectory of a particle moving in a three-dimensional space with equations:</p> <p>\begin{equation} \frac{dx}{dt} = \sigma(y - x) \end{equation}</p> <p>\begin{equation} \frac{dy}{dt} = x(\rho - z) - y \end{equation}</p> <p>\begin{equation} \frac{dz}{dt} = xy - \beta z \end{equation}</p> <p>where \(x\), \(y\), and \(z\) are the state variables, and \(\sigma\), \(\rho\), and \(\beta\) are the system parameters. The Lorenz attractor exhibits chaotic behavior, and the trajectory of the particle in the phase space is sensitive to the initial conditions.</p> <p>We will generate the time series data for the Lorenz attractor and perform RQA on it. We will assume that we are only able to observe the \(x\) variable and use it as the time series data. 
We will embed the time series data in a three-dimensional space (or more) and perform RQA on it.</p> <p>This example is borrowed from <a class="citation" href="#wallot2018analyzing">(Wallot &amp; Leonardi, 2018)</a>, an excellent resource for understanding the application of CRQA.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lorData</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">nonlinearTseries</span><span class="o">::</span><span class="n">lorenz</span><span class="p">(</span><span class="n">time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">seq</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="m">20</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.02</span><span class="p">),</span><span class="w"> </span><span class="n">do.plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">
</span><span class="c1"># head(lorData)</span><span class="w">

</span><span class="n">lorData</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">as.data.frame</span><span class="p">(</span><span class="n">lorData</span><span class="p">)</span><span class="w">

</span><span class="n">ggplot</span><span class="p">(</span><span class="n">lorData</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">y</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">z</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_path</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Lorenz Attractor"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"y"</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"z"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz-rqa1-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Plotly plot</span><span class="w">
</span><span class="n">p</span><span class="o">&lt;-</span><span class="w"> </span><span class="n">plot_ly</span><span class="p">(</span><span class="n">lorData</span><span class="p">,</span><span class="w"> 
        </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x</span><span class="p">,</span><span class="w"> 
        </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">y</span><span class="p">,</span><span class="w"> 
        </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">z</span><span class="p">,</span><span class="w"> 
        </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'scatter3d'</span><span class="p">,</span><span class="w"> 
        </span><span class="n">mode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'lines'</span><span class="p">,</span><span class="w">
        </span><span class="n">line</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">z</span><span class="p">,</span><span class="w"> </span><span class="n">colorscale</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Viridis'</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">layout</span><span class="p">(</span><span class="w">
    </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Lorenz Attractor"</span><span class="p">,</span><span class="w">
    </span><span class="n">scene</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
      </span><span class="n">camera</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
        </span><span class="n">eye</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">)</span><span class="w">
      </span><span class="p">),</span><span class="w">
      </span><span class="n">aspectmode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"data"</span><span class="p">,</span><span class="w">
      </span><span class="n">xaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"X"</span><span class="p">),</span><span class="w">
      </span><span class="n">yaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Y"</span><span class="p">),</span><span class="w">
      </span><span class="n">zaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Z"</span><span class="p">)</span><span class="w">
    </span><span class="p">)</span><span class="w">
  </span><span class="p">)</span><span class="w">

</span><span class="n">p</span><span class="w">
</span></code></pre></div></div> <iframe width="100%" height="400" src="/assets/plotly/lorenz3d.html" frameborder="0"></iframe> <h3 id="average-mutual-information">Average mutual information</h3> <p>The average mutual information (AMI) measures how much information a time series shares with a copy of itself shifted by a given <strong>time delay</strong>. The AMI is used to choose the time delay for embedding a time series: the delay should be long enough that the delayed coordinates are not redundant, yet short enough that they are still related. A common choice is the first local minimum of the AMI curve.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute AMI</span><span class="w">
</span><span class="n">ami_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">nonlinearTseries</span><span class="o">::</span><span class="n">mutualInformation</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w"> 
                                                  </span><span class="n">lag.max</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">50</span><span class="p">,</span><span class="w">
                                                  </span><span class="n">do.plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_ami-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">amis</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ami_values</span><span class="o">$</span><span class="n">mutual.information</span><span class="w"> 
</span><span class="c1"># plot(1:51, amis, type = "l", xlab = "Lag", ylab = "AMI", main = "Lag vs AMI")</span><span class="w">

</span><span class="c1"># first local minimum</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">amis</span><span class="p">)))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">delay</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="n">delay</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 10
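</code></pre></div></div> <p>As a minimal sketch of the first-local-minimum rule used above, the same <code class="language-plaintext highlighter-rouge">diff(sign(diff(...)))</code> idea can be wrapped in a small helper. The name <code class="language-plaintext highlighter-rouge">first_local_min</code> and the values in <code class="language-plaintext highlighter-rouge">ami_demo</code> are made up for illustration and are not part of the original analysis. One thing that may be worth checking in your own data: if the AMI vector starts at lag 0, position k in the vector corresponds to lag k - 1.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Hypothetical helper: position of the first local minimum of a numeric vector</span><span class="w">
</span><span class="n">first_local_min</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="c1"># diff(sign(diff(v))) equals 2 where the slope flips from negative to positive</span><span class="w">
  </span><span class="n">idx</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">v</span><span class="p">)))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
  </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">idx</span><span class="p">)</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="kc">NA_integer_</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="n">idx</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># Made-up AMI-like values for lags 0, 1, 2, ... (illustration only)</span><span class="w">
</span><span class="n">ami_demo</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">2.0</span><span class="p">,</span><span class="w"> </span><span class="m">1.4</span><span class="p">,</span><span class="w"> </span><span class="m">0.9</span><span class="p">,</span><span class="w"> </span><span class="m">0.7</span><span class="p">,</span><span class="w"> </span><span class="m">0.8</span><span class="p">,</span><span class="w"> </span><span class="m">1.0</span><span class="p">,</span><span class="w"> </span><span class="m">0.6</span><span class="p">,</span><span class="w"> </span><span class="m">0.5</span><span class="p">)</span><span class="w">
</span><span class="n">pos</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">first_local_min</span><span class="p">(</span><span class="n">ami_demo</span><span class="p">)</span><span class="w">
</span><span class="n">pos</span><span class="w">      </span><span class="c1"># position in the vector (4 for these values)</span><span class="w">
</span><span class="n">pos</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">1</span><span class="w">  </span><span class="c1"># corresponding lag when the first entry is lag 0 (3 here)</span>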
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># can also be computed using tseriesChaos package</span><span class="w">
</span><span class="n">ami_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">mutual</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">lag.max</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">50</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_ami-2.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">min_index</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which.min</span><span class="p">(</span><span class="n">ami_values</span><span class="p">)</span><span class="w">
</span><span class="n">min_value</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ami_values</span><span class="p">[</span><span class="n">min_index</span><span class="p">]</span><span class="w">
</span><span class="n">min_index</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## 46 
## 47
</code></pre></div></div> <h3 id="false-nearest-neighbors">False Nearest Neighbors</h3> <p>The false nearest neighbors (FNN) method is used to determine the minimum <strong>embedding dimension</strong> for a time series. It increases the embedding dimension step by step and, at each step, computes the fraction of neighbors that are close only because the attractor is projected into too few dimensions (the false nearest neighbors). The optimal embedding dimension is the smallest one at which this fraction drops to (near) zero, which indicates that the underlying dynamics of the system are captured. When in doubt, it is safer to overestimate the embedding dimension than to underestimate it, to avoid missing the underlying dynamics of the system.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute FNN</span><span class="w">
</span><span class="n">fnn_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">false.nearest</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w"> 
                                         </span><span class="n">m</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">20</span><span class="p">,</span><span class="w"> </span><span class="c1"># maximum embedding dimension (search param)</span><span class="w">
                                         </span><span class="n">d</span><span class="o">=</span><span class="n">delay</span><span class="p">,</span><span class="w"> </span><span class="c1"># time delay</span><span class="w">
                                         </span><span class="n">t</span><span class="o">=</span><span class="m">0</span><span class="w"> </span><span class="c1"># theiler window</span><span class="w">
                                         </span><span class="p">)</span><span class="w">

</span><span class="c1"># plot the FNN values</span><span class="w">
</span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">plot.false.nearest</span><span class="p">(</span><span class="n">fnn_values</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_fnn-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Extract the 'fraction' row</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">fnn_values</span><span class="p">[</span><span class="s2">"fraction"</span><span class="p">,</span><span class="w"> </span><span class="p">]</span><span class="w">

</span><span class="c1"># Convert to numeric, removing empty values (if any)</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">[</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)])</span><span class="w">

</span><span class="c1"># Find the first local minimum index in the FNN values</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)))</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="n">embed_dim</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 4
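</code></pre></div></div> <p>An alternative to the first-local-minimum rule, sketched here under stated assumptions, is to pick the smallest embedding dimension whose FNN fraction falls below a tolerance. The fractions in <code class="language-plaintext highlighter-rouge">fnn_demo</code> and the tolerance <code class="language-plaintext highlighter-rouge">tol</code> are made-up illustration values, not results from the Lorenz data, and the threshold rule is a swapped-in heuristic rather than the method used in this post.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Made-up FNN fractions for embedding dimensions 1 to 6 (illustration only)</span><span class="w">
</span><span class="n">fnn_demo</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="m">0.85</span><span class="p">,</span><span class="w"> </span><span class="m">0.30</span><span class="p">,</span><span class="w"> </span><span class="m">0.02</span><span class="p">,</span><span class="w"> </span><span class="m">0.01</span><span class="p">,</span><span class="w"> </span><span class="m">0.012</span><span class="p">,</span><span class="w"> </span><span class="m">0.011</span><span class="p">)</span><span class="w">
</span><span class="n">tol</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="m">0.05</span><span class="w">  </span><span class="c1"># assumed tolerance for "close enough to zero"</span><span class="w">

</span><span class="c1"># Smallest dimension whose fraction of false neighbors is below the tolerance</span><span class="w">
</span><span class="n">dim_tol</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">fnn_demo</span><span class="w"> </span><span class="o">&lt;</span><span class="w"> </span><span class="n">tol</span><span class="p">)[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="n">dim_tol</span>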
</code></pre></div></div> <p>For the Lorenz data, the first local minimum of the FNN curve is at embedding dimension 4. However, the curve already flattens out at 3, and three dimensions are easier to visualize, so we use an embedding dimension of 3 here (ideally 4: when in doubt, choose the higher value).</p> <p>As a side note, the Theiler window is a temporal window that excludes points that are close in time from the neighbor search in the false nearest neighbors (FNN) analysis. Points that are close in time are often highly autocorrelated, so they can look like neighbors without reflecting the geometry of the attractor. Excluding them makes the analysis more robust.</p> <h3 id="phase-space-reconstruction">Phase Space Reconstruction</h3> <p>Phase-space reconstruction through 3-D embedding of the individual time series.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># define x1, x2, x3 using delay</span><span class="w">
</span><span class="n">x1</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)]</span><span class="w">  </span><span class="c1"># x(t)</span><span class="w">
</span><span class="n">x2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w">  </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w">  </span><span class="n">delay</span><span class="p">)]</span><span class="w">  </span><span class="c1"># x(t + delay)</span><span class="w">
</span><span class="n">x3</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)]</span><span class="w">  </span><span class="c1"># x(t + 2 * delay)</span><span class="w">

</span><span class="n">new_data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">x1</span><span class="o">=</span><span class="n">x1</span><span class="p">,</span><span class="w">
                       </span><span class="n">x2</span><span class="o">=</span><span class="n">x2</span><span class="p">,</span><span class="w">
                       </span><span class="n">x3</span><span class="o">=</span><span class="n">x3</span><span class="p">)</span><span class="w">

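</span><span class="c1"># Sketch (an illustrative addition, not part of the original analysis): the same</span><span class="w">
</span><span class="c1"># delay vectors for any embedding dimension m and delay tau.</span><span class="w">
</span><span class="c1"># embed_delay is a hypothetical helper name, assuming x is a numeric vector.</span><span class="w">
</span><span class="n">embed_delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">,</span><span class="w"> </span><span class="n">m</span><span class="p">,</span><span class="w"> </span><span class="n">tau</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
  </span><span class="n">n</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">length</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="p">(</span><span class="n">m</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">1</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">tau</span><span class="w">  </span><span class="c1"># number of complete delay vectors</span><span class="w">
  </span><span class="c1"># column k + 1 holds x(t + k * tau)</span><span class="w">
  </span><span class="n">sapply</span><span class="p">(</span><span class="m">0</span><span class="o">:</span><span class="p">(</span><span class="n">m</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">1</span><span class="p">),</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">k</span><span class="p">)</span><span class="w"> </span><span class="n">x</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">tau</span><span class="p">)</span><span class="o">:</span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">tau</span><span class="p">)])</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="c1"># embed_delay(lorData$x, 3, delay) should give the same columns as x1, x2, x3</span><span class="w">
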
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">new_data</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x1</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x2</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_path</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">NULL</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x(t)"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x(t + 10)"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_minimal</span><span class="p">()</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_psr-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">p</span><span class="o">&lt;-</span><span class="w"> </span><span class="n">plot_ly</span><span class="p">(</span><span class="n">new_data</span><span class="p">,</span><span class="w"> 
        </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x1</span><span class="p">,</span><span class="w"> 
        </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x2</span><span class="p">,</span><span class="w"> 
        </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x3</span><span class="p">,</span><span class="w"> 
        </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'scatter3d'</span><span class="p">,</span><span class="w"> 
        </span><span class="n">mode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'lines'</span><span class="p">,</span><span class="w">
        </span><span class="n">line</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x3</span><span class="p">,</span><span class="w"> </span><span class="n">colorscale</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'Viridis'</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">layout</span><span class="p">(</span><span class="w">
    </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Lorenz Attractor"</span><span class="p">,</span><span class="w">
    </span><span class="n">scene</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
      </span><span class="n">camera</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
        </span><span class="n">eye</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1.5</span><span class="p">)</span><span class="w">
      </span><span class="p">),</span><span class="w">
      </span><span class="n">aspectmode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"data"</span><span class="p">,</span><span class="w">
      </span><span class="n">xaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x(t)"</span><span class="p">),</span><span class="w">
      </span><span class="n">yaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x(t + 10)"</span><span class="p">),</span><span class="w">
      </span><span class="n">zaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"x(t + 20)"</span><span class="p">)</span><span class="w">
    </span><span class="p">)</span><span class="w">
  </span><span class="p">)</span><span class="w">

</span><span class="n">p</span><span class="w">
</span></code></pre></div></div> <iframe width="100%" height="400" src="/assets/plotly/lorenz3d_psr.html" frameborder="0"></iframe> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># embed using lorData$x</span><span class="w">
</span><span class="n">x1</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)]</span><span class="w">
</span><span class="n">x2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w">  </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w">  </span><span class="n">delay</span><span class="p">)]</span><span class="w">
</span><span class="n">x3</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">)]</span><span class="w">

</span><span class="c1"># embed using lorData$y</span><span class="w">
</span><span class="n">y1</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)]</span><span class="w">
</span><span class="n">y2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w">  </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="p">(</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w">  </span><span class="n">delay</span><span class="p">)]</span><span class="w">
</span><span class="n">y3</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">[(</span><span class="m">1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">delay</span><span class="p">)</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">)]</span><span class="w">

</span><span class="n">new_data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">data.frame</span><span class="p">(</span><span class="n">x1</span><span class="o">=</span><span class="n">x1</span><span class="p">,</span><span class="w">
                       </span><span class="n">x2</span><span class="o">=</span><span class="n">x2</span><span class="p">,</span><span class="w">
                       </span><span class="n">x3</span><span class="o">=</span><span class="n">x3</span><span class="p">,</span><span class="w">
                       </span><span class="n">y1</span><span class="o">=</span><span class="n">y1</span><span class="p">,</span><span class="w">
                       </span><span class="n">y2</span><span class="o">=</span><span class="n">y2</span><span class="p">,</span><span class="w">
                       </span><span class="n">y3</span><span class="o">=</span><span class="n">y3</span><span class="p">)</span><span class="w">

</span><span class="c1"># Define the trace for the delay embedding of x: x(t), x(t + delay), x(t + 2 * delay)</span><span class="w">
</span><span class="n">trace1</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">plot_ly</span><span class="p">(</span><span class="w">
  </span><span class="n">new_data</span><span class="p">,</span><span class="w">
  </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x1</span><span class="p">,</span><span class="w">
  </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x2</span><span class="p">,</span><span class="w">
  </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">x3</span><span class="p">,</span><span class="w">
  </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'scatter3d'</span><span class="p">,</span><span class="w">
  </span><span class="n">mode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'lines'</span><span class="p">,</span><span class="w">
  </span><span class="n">line</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'blue'</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">layout</span><span class="p">(</span><span class="n">scene</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
    </span><span class="n">xaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"X(t)"</span><span class="p">),</span><span class="w">
    </span><span class="n">yaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"X(t + 10)"</span><span class="p">),</span><span class="w">
    </span><span class="n">zaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"X(t + 20)"</span><span class="p">),</span><span class="w">
    </span><span class="n">aspectmode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"data"</span><span class="w">
  </span><span class="p">))</span><span class="w">

</span><span class="c1"># Define the trace for the delay embedding of y: y(t), y(t + delay), y(t + 2 * delay)</span><span class="w">
</span><span class="n">trace2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">plot_ly</span><span class="p">(</span><span class="w">
  </span><span class="n">new_data</span><span class="p">,</span><span class="w">
  </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">y1</span><span class="p">,</span><span class="w">
  </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">y2</span><span class="p">,</span><span class="w">
  </span><span class="n">z</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="n">y3</span><span class="p">,</span><span class="w">
  </span><span class="n">type</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'scatter3d'</span><span class="p">,</span><span class="w">
  </span><span class="n">mode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'lines'</span><span class="p">,</span><span class="w">
  </span><span class="n">line</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">width</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'red'</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">layout</span><span class="p">(</span><span class="n">scene</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="w">
    </span><span class="n">xaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Y(t)"</span><span class="p">),</span><span class="w">
    </span><span class="n">yaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Y(t + 10)"</span><span class="p">),</span><span class="w">
    </span><span class="n">zaxis</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">list</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Y(t + 20)"</span><span class="p">),</span><span class="w">
    </span><span class="n">aspectmode</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"data"</span><span class="w">
  </span><span class="p">))</span><span class="w">

</span><span class="c1"># Combine the plots into one visualization using subplot</span><span class="w">
</span><span class="n">p</span><span class="o">&lt;-</span><span class="n">subplot</span><span class="p">(</span><span class="w">
  </span><span class="n">trace1</span><span class="p">,</span><span class="w"> </span><span class="n">trace2</span><span class="p">,</span><span class="w">
  </span><span class="n">nrows</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Arrange plots in one row</span><span class="w">
  </span><span class="n">titleX</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">titleY</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">,</span><span class="w"> </span><span class="n">shareY</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="w">
</span><span class="p">)</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">layout</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Two Delayed Plots"</span><span class="p">)</span><span class="w">

</span><span class="n">p</span><span class="w">
</span></code></pre></div></div> <iframe width="100%" height="400" src="/assets/plotly/lorenz3d_psr2.html" frameborder="0"></iframe> <h3 id="rqa-on-lorenz-attractor">RQA on Lorenz Attractor</h3> <p>RQA with multidimensional embedding</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Perform RQA on the Lorenz attractor</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print RQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 2.348752
## 
## $DET
## [1] 99.06977
## 
## $NRLINE
## [1] 1741
## 
## $maxL
## [1] 971
## 
## $L
## [1] 12.60138
## 
## $ENTR
## [1] 2.882162
## 
## $rENTR
## [1] 0.6761384
## 
## $LAM
## [1] 98.60917
## 
## $TT
## [1] 4.086265
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot</span><span class="w">
</span><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_rqa-1.png" alt=""/></p> <h3 id="crqa-on-lorenz-attractor">CRQA on Lorenz Attractor</h3> <p>The only change from the RQA call above is that <code class="language-plaintext highlighter-rouge">ts2</code> is now <code class="language-plaintext highlighter-rouge">lorData$y</code>, so the x series is compared with the y series (cross-recurrence) rather than with itself.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Perform CRQA on the Lorenz attractor (x vs. y)</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">x</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">lorData</span><span class="o">$</span><span class="n">y</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print CRQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 0.6662841
## 
## $DET
## [1] 96.19548
## 
## $NRLINE
## [1] 1200
## 
## $maxL
## [1] 38
## 
## $L
## [1] 5.035833
## 
## $ENTR
## [1] 2.178686
## 
## $rENTR
## [1] 0.634448
## 
## $LAM
## [1] 87.88602
## 
## $TT
## [1] 2.8154
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot</span><span class="w">
</span><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/lorenz_crqa-1.png" alt=""/></p> <h2 id="electrocardiogram-ecg-data-analysis">Electrocardiogram (ECG) Data Analysis</h2> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># load csv file with headers</span><span class="w">
</span><span class="n">ecg_data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w">  </span><span class="n">read_csv</span><span class="p">(</span><span class="s2">"ecg.csv"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## New names:
## Rows: 184771 Columns: 10
## ── Column specification
## ──────────────────────────────────────────────────────── Delimiter: "," chr
## (3): eid, p1_id, p2_id dbl (6): ...1, p1_ecg, p2_ecg, trial, condition,
## local_trial dttm (1): timestamp
## ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
## Specify the column types or set `show_col_types = FALSE` to quiet this message.
## • `` -&gt; `...1`
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># convert timestamp to POSIXct</span><span class="w">
</span><span class="n">ecg_data</span><span class="o">$</span><span class="n">timestamp</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">as.POSIXct</span><span class="p">(</span><span class="n">ecg_data</span><span class="o">$</span><span class="n">timestamp</span><span class="p">,</span><span class="w"> </span><span class="n">format</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"%Y-%m-%d %H:%M:%OS"</span><span class="p">)</span><span class="w">

</span><span class="c1"># Convert data to a tsibble object</span><span class="w">
</span><span class="n">ecg_tsibble</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_data</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">as_tsibble</span><span class="p">(</span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">timestamp</span><span class="p">)</span><span class="w">

</span><span class="c1"># Resample to 100 Hz using interval = "0.01 secs"</span><span class="w">
</span><span class="n">ecg_resampled</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_tsibble</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">index_by</span><span class="p">(</span><span class="n">new_time</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">~</span><span class="w"> </span><span class="n">floor_date</span><span class="p">(</span><span class="n">.x</span><span class="p">,</span><span class="w"> </span><span class="s2">"0.01 secs"</span><span class="p">))</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w"> </span><span class="c1"># Floor each timestamp to its 10 ms bin</span><span class="w">
  </span><span class="n">summarize</span><span class="p">(</span><span class="n">p1_ecg</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">p1_ecg</span><span class="p">,</span><span class="w"> </span><span class="n">na.rm</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">),</span><span class="w">
            </span><span class="n">p2_ecg</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mean</span><span class="p">(</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w"> </span><span class="n">na.rm</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
            </span><span class="p">)</span><span class="w"> </span><span class="c1"># Use mean for downsampling</span><span class="w">

</span><span class="c1"># subset data</span><span class="w">
</span><span class="n">ecg_resampled</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="p">[</span><span class="m">4000</span><span class="o">:</span><span class="m">10000</span><span class="p">,]</span><span class="w">

</span><span class="c1"># plot p1_ecg</span><span class="w">
</span><span class="n">p1</span><span class="o">&lt;-</span><span class="n">ecg_resampled</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">new_time</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p1_ecg</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"P1 ECG"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Time"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"ECG"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">

</span><span class="c1"># plot p2_ecg</span><span class="w">
</span><span class="n">p2</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">new_time</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p2_ecg</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"P2 ECG"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Time"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"ECG"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">


</span><span class="n">p1</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">p2</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_data-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Add an index column to the data for plotting</span><span class="w">
</span><span class="n">ecg_resampled_</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">mutate</span><span class="p">(</span><span class="n">index</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">row_number</span><span class="p">())</span><span class="w">


</span><span class="c1"># Plot using the new index</span><span class="w">
</span><span class="n">p1_</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_resampled_</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">index</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p1_ecg</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"P1 ECG"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Index"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"ECG"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">

</span><span class="n">p2_</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ecg_resampled_</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">index</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">p2_ecg</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"P2 ECG"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Index"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"ECG"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">

</span><span class="n">p1_</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">p2_</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_data-2.png" alt=""/></p> <h3 id="ami-on-ecg-data">AMI on ECG Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute AMI</span><span class="w">
</span><span class="n">ami_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">nonlinearTseries</span><span class="o">::</span><span class="n">mutualInformation</span><span class="p">(</span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w">
                                                  </span><span class="n">lag.max</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">50</span><span class="p">,</span><span class="w">
                                                  </span><span class="n">do.plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_ami-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">amis</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ami_values</span><span class="o">$</span><span class="n">mutual.information</span><span class="w"> 
</span><span class="c1"># plot(seq_along(amis), amis, type = "l", xlab = "Index", ylab = "AMI", main = "AMI vs Index")</span><span class="w">

</span><span class="c1"># first local minimum of the AMI curve</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">amis</span><span class="p">)))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">delay</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="n">delay</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 18
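</code></pre></div></div> <p>The first-local-minimum rule used for the delay can be wrapped in a small helper that also covers the case where the curve has no interior minimum (in that case <code class="language-plaintext highlighter-rouge">which()</code> returns an empty vector and <code class="language-plaintext highlighter-rouge">delay[1]</code> becomes <code class="language-plaintext highlighter-rouge">NA</code>). This is only a sketch in base R: <code class="language-plaintext highlighter-rouge">first_local_min</code> is a hypothetical name, not a function from <code class="language-plaintext highlighter-rouge">nonlinearTseries</code>, and the vectors below are toy values rather than the ECG data.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper (not part of any package): position of the first
# strict local minimum of a numeric vector, or NA if there is none.
first_local_min &lt;- function(v) {
  # A sign change of the first difference from -1 to +1 marks a local minimum
  idx &lt;- which(diff(sign(diff(v))) == 2) + 1
  if (length(idx) == 0) NA_integer_ else idx[1]
}

# Toy vectors, not the ECG data
first_local_min(c(5, 3, 2, 4, 1, 6))  # 3 (the value 2 is below both neighbours)
first_local_min(c(5, 4, 3, 2))        # NA (monotone, no local minimum)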
</code></pre></div></div> <h3 id="fnn-on-ecg-data">FNN on ECG Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute FNN</span><span class="w">
</span><span class="n">fnn_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">false.nearest</span><span class="p">(</span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w">
                                         </span><span class="n">m</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">10</span><span class="p">,</span><span class="w">
                                         </span><span class="n">d</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
                                         </span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="w">
                                         </span><span class="p">)</span><span class="w">

</span><span class="c1"># plot the FNN values</span><span class="w">
</span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">plot.false.nearest</span><span class="p">(</span><span class="n">fnn_values</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_fnn-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Extract the 'fraction' row</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">fnn_values</span><span class="p">[</span><span class="s2">"fraction"</span><span class="p">,</span><span class="w"> </span><span class="p">]</span><span class="w">

</span><span class="c1"># Convert to numeric, removing empty values (if any)</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">[</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)])</span><span class="w">

</span><span class="c1"># Find the first local minimum index in the FNN values</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)))</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
</span><span class="n">embed_dim</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 5
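</code></pre></div></div> <p>With <code class="language-plaintext highlighter-rouge">delay</code> and <code class="language-plaintext highlighter-rouge">embed_dim</code> chosen, the RQA calls work on a time-delay embedding of the signal: each point is the vector of the series at times t, t + delay, t + 2 * delay, and so on. As a rough illustration of that reconstruction, here is a base-R sketch on a toy series. <code class="language-plaintext highlighter-rouge">delay_embed</code> is a hypothetical helper, and this is not necessarily how <code class="language-plaintext highlighter-rouge">crqa</code> builds its embedding internally.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: time-delay embedding of a series x
# with embedding dimension m and delay tau (in samples).
delay_embed &lt;- function(x, m, tau) {
  n &lt;- length(x) - (m - 1) * tau   # number of embedded points
  stopifnot(n &gt; 0)
  # Column k + 1 holds the series shifted by k * tau samples
  cols &lt;- lapply(0:(m - 1), function(k) x[(1 + k * tau):(n + k * tau)])
  do.call(cbind, cols)
}

# Toy series, not the ECG data
emb &lt;- delay_embed(1:10, m = 3, tau = 2)
dim(emb)   # 6 rows (points), 3 columns (dimensions)
emb[1, ]   # 1 3 5, i.e. x[1], x[1 + tau], x[1 + 2 * tau]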
</code></pre></div></div> <h3 id="rqa-on-ecg-data">RQA on ECG Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Perform RQA on P1's ECG (delay and embed_dim were estimated from p2_ecg above)</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p1_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p1_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 16.68069
## 
## $DET
## [1] 92.8737
## 
## $NRLINE
## [1] 1173703
## 
## $maxL
## [1] 5929
## 
## $L
## [1] 4.639929
## 
## $ENTR
## [1] 2.11479
## 
## $rENTR
## [1] 0.4402126
## 
## $LAM
## [1] 95.89235
## 
## $TT
## [1] 6.376579
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot</span><span class="w">
</span><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_rqa2-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Perform RQA on the ecg data</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 6.827498
## 
## $DET
## [1] 86.51108
## 
## $NRLINE
## [1] 557649
## 
## $maxL
## [1] 5929
## 
## $L
## [1] 3.723362
## 
## $ENTR
## [1] 1.716389
## 
## $rENTR
## [1] 0.3894937
## 
## $LAM
## [1] 93.21525
## 
## $TT
## [1] 4.523905
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot</span><span class="w">
</span><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_rqa-1.png" alt=""/></p> <h3 id="crqa-on-ecg-data">CRQA on ECG Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Perform CRQA on the ECG data (p1_ecg vs. p2_ecg)</span><span class="w">
</span><span class="n">rqa_result</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p1_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ecg_resampled</span><span class="o">$</span><span class="n">p2_ecg</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_result</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 8.610749
## 
## $DET
## [1] 87.73045
## 
## $NRLINE
## [1] 710177
## 
## $maxL
## [1] 12
## 
## $L
## [1] 3.739276
## 
## $ENTR
## [1] 1.750265
## 
## $rENTR
## [1] 0.7299173
## 
## $LAM
## [1] 95.15887
## 
## $TT
## [1] 4.84185
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># plot</span><span class="w">
</span><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_result</span><span class="o">$</span><span class="n">RP</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/ecg_crqa3-1.png" alt=""/></p> <h2 id="interbeat-interval-data-analysis">Interbeat Interval Data Analysis</h2> <p>Note: the data used here were preprocessed and resampled so that an IBI value is available for each participant at every time point.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ibi_data</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">read.csv</span><span class="p">(</span><span class="s2">"ibi.csv"</span><span class="p">)</span><span class="w"> </span><span class="c1"># time, IBI1, IBI2</span><span class="w">

</span><span class="c1"># convert time format 2023-06-29 16:06:08.409180</span><span class="w">
</span><span class="n">ibi_data</span><span class="o">$</span><span class="n">time</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">as.POSIXct</span><span class="p">(</span><span class="n">ibi_data</span><span class="o">$</span><span class="n">time</span><span class="p">,</span><span class="w"> </span><span class="n">format</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"%Y-%m-%d %H:%M:%OS"</span><span class="p">)</span><span class="w">

</span><span class="c1"># plot IBI time series</span><span class="w">
</span><span class="c1"># Reshape the data to a long format</span><span class="w">
</span><span class="n">ibi_long</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ibi_data</span><span class="w"> </span><span class="o">%&gt;%</span><span class="w">
  </span><span class="n">pivot_longer</span><span class="p">(</span><span class="n">cols</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="n">IBI1</span><span class="p">,</span><span class="w"> </span><span class="n">IBI2</span><span class="p">),</span><span class="w"> </span><span class="n">names_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"IBI_type"</span><span class="p">,</span><span class="w"> </span><span class="n">values_to</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"IBI_value"</span><span class="p">)</span><span class="w">

</span><span class="c1"># Plot both IBI1 and IBI2 in one plot</span><span class="w">
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">ibi_long</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">time</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">IBI_value</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">IBI_type</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">labs</span><span class="p">(</span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"IBI Time Series Comparison"</span><span class="p">,</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Time (s)"</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"IBI (ms)"</span><span class="p">,</span><span class="w"> </span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"IBI Type"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
  </span><span class="n">theme_bw</span><span class="p">()</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_ibi-1.png" alt=""/></p> <h3 id="ami-on-ibi-data">AMI on IBI Data</h3> <p>AMI stands for Average Mutual Information. It is used to choose the time delay for a time series: the delay is taken at the first local minimum of the AMI curve.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute AMI</span><span class="w">
</span><span class="n">ami_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">nonlinearTseries</span><span class="o">::</span><span class="n">mutualInformation</span><span class="p">(</span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI2</span><span class="p">,</span><span class="w"> 
                                                  </span><span class="n">lag.max</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">50</span><span class="p">,</span><span class="w">
                                                  </span><span class="n">do.plot</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_ami-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute the first minimum of AMI</span><span class="w">
</span><span class="n">amis</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ami_values</span><span class="o">$</span><span class="n">mutual.information</span><span class="w"> 

</span><span class="c1"># index of the first local minimum of the AMI curve</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">amis</span><span class="p">)))</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">delay</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">delay</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
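</span><span class="c1"># Hedged addition, not part of the original analysis: if the AMI curve has no</span><span class="w">
</span><span class="c1"># strict local minimum within lag.max, delay is NA at this point, so stop early.</span><span class="w">
</span><span class="n">stopifnot</span><span class="p">(</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">delay</span><span class="p">))</span><span class="w">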
</span><span class="n">delay</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 34
</code></pre></div></div> <h3 id="fnn-on-ibi-data">FNN on IBI Data</h3> <p>FNN stands for False Nearest Neighbors. It is used to determine the minimum embedding dimension for a time series.</p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Compute FNN</span><span class="w">
</span><span class="n">fnn_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">false.nearest</span><span class="p">(</span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI1</span><span class="p">,</span><span class="w"> 
                                         </span><span class="n">m</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">15</span><span class="p">,</span><span class="w">
                                         </span><span class="n">d</span><span class="o">=</span><span class="n">delay</span><span class="p">,</span><span class="w">
                                         </span><span class="n">t</span><span class="o">=</span><span class="m">1</span><span class="w">
                                         </span><span class="p">)</span><span class="w">

</span><span class="c1"># plot the FNN values</span><span class="w">
</span><span class="n">tseriesChaos</span><span class="o">::</span><span class="n">plot.false.nearest</span><span class="p">(</span><span class="n">fnn_values</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_fnn-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Extract the 'fraction' row</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">fnn_values</span><span class="p">[</span><span class="s2">"fraction"</span><span class="p">,</span><span class="w"> </span><span class="p">]</span><span class="w">

</span><span class="c1"># Convert to numeric, removing empty values (if any)</span><span class="w">
</span><span class="n">fraction_values</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">as.numeric</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">[</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)])</span><span class="w">

</span><span class="c1"># Find the first local minimum index in the FNN values</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">which</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="nf">sign</span><span class="p">(</span><span class="n">diff</span><span class="p">(</span><span class="n">fraction_values</span><span class="p">)))</span><span class="w"> </span><span class="o">&gt;</span><span class="w"> </span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w">
</span><span class="n">embed_dim</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">[</span><span class="m">1</span><span class="p">]</span><span class="w">
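</span><span class="c1"># Hedged addition, not part of the original analysis: if the FNN fraction keeps</span><span class="w">
</span><span class="c1"># decreasing and never flattens or turns upward, embed_dim is NA, so stop early.</span><span class="w">
</span><span class="n">stopifnot</span><span class="p">(</span><span class="o">!</span><span class="nf">is.na</span><span class="p">(</span><span class="n">embed_dim</span><span class="p">))</span><span class="w">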
</span><span class="n">embed_dim</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## [1] 5
</code></pre></div></div> <h3 id="rqa-on-ibi-data">RQA on IBI Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rqa_res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI1</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI1</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.1</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print RQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_res</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 0.5278537
## 
## $DET
## [1] 98.91183
## 
## $NRLINE
## [1] 991
## 
## $maxL
## [1] 1648
## 
## $L
## [1] 14.30878
## 
## $ENTR
## [1] 2.408787
## 
## $rENTR
## [1] 0.6670843
## 
## $LAM
## [1] 99.65123
## 
## $TT
## [1] 7.027054
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_res</span><span class="o">$</span><span class="n">RP</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Recurrence Plot"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_rqa-1.png" alt=""/></p> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rqa_res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI2</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI2</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print RQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_res</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 1.849624
## 
## $DET
## [1] 99.65362
## 
## $NRLINE
## [1] 3409
## 
## $maxL
## [1] 1648
## 
## $L
## [1] 14.68466
## 
## $ENTR
## [1] 2.974163
## 
## $rENTR
## [1] 0.7178534
## 
## $LAM
## [1] 99.85269
## 
## $TT
## [1] 15.18619
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_res</span><span class="o">$</span><span class="n">RP</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Recurrence Plot"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_rqa2-1.png" alt=""/></p> <h3 id="crqa-on-ibi-data">CRQA on IBI Data</h3> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">rqa_res</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">crqa</span><span class="o">::</span><span class="n">crqa</span><span class="p">(</span><span class="w">
  </span><span class="n">ts1</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI1</span><span class="p">,</span><span class="w">
  </span><span class="n">ts2</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ibi_data</span><span class="o">$</span><span class="n">IBI2</span><span class="p">,</span><span class="w">
  </span><span class="n">embed</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">embed_dim</span><span class="p">,</span><span class="w">
  </span><span class="n">delay</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">delay</span><span class="p">,</span><span class="w">
  </span><span class="n">radius</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.2</span><span class="p">,</span><span class="w">   </span><span class="c1"># Threshold distance for recurrence</span><span class="w">
  </span><span class="n">normalize</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w">  </span><span class="c1"># Normalize data</span><span class="w">
  </span><span class="n">mindiagline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of diagonal lines</span><span class="w">
  </span><span class="n">minvertline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="c1"># Minimum length of vertical lines</span><span class="w">
  </span><span class="n">tw</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w">         </span><span class="c1"># Theiler window</span><span class="w">
  </span><span class="n">whiteline</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="kc">FALSE</span><span class="p">,</span><span class="w">
  </span><span class="n">side</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"both"</span><span class="w">
</span><span class="p">)</span><span class="w">

</span><span class="c1"># Print RQA results</span><span class="w">
</span><span class="n">print</span><span class="p">(</span><span class="n">rqa_res</span><span class="p">[</span><span class="m">1</span><span class="o">:</span><span class="m">9</span><span class="p">])</span><span class="w">
</span></code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>## $RR
## [1] 0.3534366
## 
## $DET
## [1] 98.63527
## 
## $NRLINE
## [1] 1590
## 
## $maxL
## [1] 31
## 
## $L
## [1] 5.954717
## 
## $ENTR
## [1] 2.360008
## 
## $rENTR
## [1] 0.7331776
## 
## $LAM
## [1] 99.30201
## 
## $TT
## [1] 8.571942
</code></pre></div></div> <div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">crqa</span><span class="o">::</span><span class="n">plot_rp</span><span class="p">(</span><span class="n">rqa_res</span><span class="o">$</span><span class="n">RP</span><span class="p">,</span><span class="w"> </span><span class="n">title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Recurrence Plot"</span><span class="p">)</span><span class="w">
</span></code></pre></div></div> <p><img src="/assets/img/crqasynch/hrv_crqax-1.png" alt=""/></p> <h2 id="multi-dimensional-rqa-mdrqa">Multi-dimensional RQA (MdRQA)</h2> <p><strong>Why do we need a multi-dimensional extension of RQA?</strong> Behavioral analysis increasingly captures physiological signals such as heart rate, respiration, and electrodermal activity. These data streams are often interdependent, reflecting the complex interactions between physiological and psychological processes. Traditional RQA is limited to analyzing a single time series, making it difficult to capture the rich dynamics of multi-dimensional data. Multi-dimensional RQA (MdRQA) extends the RQA framework to treat several time series as one multi-dimensional system and analyze their joint recurrence structure, offering a more nuanced understanding of complex behavioral interactions.</p> <p><a class="citation" href="#wallot2019multidimensional">(Wallot, 2019)</a> provides a comprehensive overview of MdRQA, and I highly recommend reading it. A detailed walkthrough of MdRQA is left for a separate post.</p> <p><a href="https://github.com/Wallot/MdRQA">https://github.com/Wallot/MdRQA</a></p> <h2 id="multi-dimensional-crqa-mdcrqa">Multi-dimensional CRQA (MdCRQA)</h2> <p>Multi-dimensional CRQA (MdCRQA) extends the CRQA framework to quantify the coupling between two multi-dimensional time series. MdCRQA is a tool for studying the complex interactions between physiological and psychological processes, providing insights into the dynamics of multi-dimensional data.</p> <p><a href="https://github.com/Wallot/MdCRQA">https://github.com/Wallot/MdCRQA</a></p> <h2 id="conclusion">Conclusion</h2> <p>In this post, we discussed the basics of Recurrence Quantification Analysis (RQA) and its application to physiological data. 
We covered the key concepts of RQA, including Recurrence Rate (RR), Determinism (DET), and Entropy (ENT), and demonstrated how to perform RQA on different types of physiological data, including the Lorenz attractor, electrocardiogram (ECG) data, and interbeat interval (IBI) data. We also discussed the importance of embedding dimension and time delay in RQA and how to determine these parameters using the Average Mutual Information (AMI) and False Nearest Neighbors (FNN) methods.</p> <p>MdRQA and MdCRQA will be covered in a separate post soon. A lot of the content in this post is inspired and sometimes directly borrowed from the excellent resources <a class="citation" href="#carello2005nonlinear">(Carello &amp; Moreno, 2005; Webber Jr &amp; Zbilut, 2005; Wallot, 2019; Wallot &amp; Leonardi, 2018; Eloy et al., 2023)</a>. I highly recommend reading them for a deeper understanding of the concepts discussed here.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Contents In this post, we will discuss the Cross Recurrence Quantification Analysis (CRQA) method. CRQA is a method to quantify the degree of similarity between two time series. It is a generalization of Recurrence Quantification Analysis (RQA) to two time series. We will start by discussing the basic idea behind RQA and then move on to CRQA using illustrative examples. 
We also discuss the multi-dimensional generalization of RQA and CRQA, which allows us to analyze the similarity between multiple time series.]]></summary></entry><entry><title type="html">The garden of dreams</title><link href="https://nimrobotics.github.io/blog/2024/garden-of-dreams/" rel="alternate" type="text/html" title="The garden of dreams"/><published>2024-10-27T20:18:22+00:00</published><updated>2024-10-27T20:18:22+00:00</updated><id>https://nimrobotics.github.io/blog/2024/garden-of-dreams</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2024/garden-of-dreams/"><![CDATA[<p>After years of procrastination, I finally decided to get back to writing and sharing my thoughts. A lot has happened in the past few years, and I have been fortunate to have had some amazing experiences. I have met some incredible people, traveled to new places, and learned a lot about myself. I have also had my fair share of challenges. One thing that I am super proud of and grateful for is my brand new garden :seedling:. I have always loved plants and gardening, and I have been dreaming of having my own garden for years, but being in school and the constant moving made it impossible. By pure chance, I discovered a beautiful community garden and of course I had to get a plot. In March 2024, I got my plot and I have been spending most of my weekends there since then. There’s something so pure about touching the earth and walking barefoot on the land. A lot of nice physical exercise too! Although I have always been close to plants, this was my first-hand experience doing actual gardening. I have learned so much in the past few months and I am so grateful for the opportunity to grow my own food. 
I have planted a variety of vegetables and herbs including tomatoes (cherry, striped German, Amish slicer gold), peppers, cucumbers, zucchini, basil, mint, cilantro, carrots, radishes, mustard, cabbage, spinach, onions, butternut squash, and more :seedling: :tomato: :hot_pepper: :cucumber: :herb: :carrot: :onion:. Most of these were planted from tiny seeds and it’s been so rewarding to see them grow and produce. It’s magical to see such a big thing come out of a teeny tiny seed. I could only start growing in mid-April (it was snowing until the first week of April!) and unfortunately, I had to leave for California in May. I was so worried about my plants but I had some amazing friends (VB and MK) who took care of them while I was away. I am so grateful for their help and I am so happy to see my garden thriving. I came back to ripe and delicious cherry tomatoes, cucumbers, and zucchinis in September and I was so happy to see that my plants survived the summer. I have been harvesting and eating fresh veggies from my garden and it’s been so rewarding. It’s October and time to prepare the garden for winter, until next year! I am so grateful for this experience and I am looking forward to growing more next year. One outcome of this exploration was that it allowed me to meet a lot of new people and make new friends.</p> <p>The hardest part was the beginning: removing all the dried-out weeds, making beds and getting things started. The first batch of seeds I planted in late March never germinated because it was probably too cold, but I had success the next month. A lot of the produce was also eaten by wild animals! Hope they had fun, haha. One more thing I learned is that plants are very, very resilient and they will thrive even if you water them just once every 7-10 days, or sometimes even less frequently. 
This is just the beginning, I hope to grow most of my food on my own one day!</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/garden/garden_im2-480.webp 480w,/assets/img/blog/garden/garden_im2-800.webp 800w,/assets/img/blog/garden/garden_im2-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/garden/garden_im2.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/garden/garden_im4-480.webp 480w,/assets/img/blog/garden/garden_im4-800.webp 800w,/assets/img/blog/garden/garden_im4-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/garden/garden_im4.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/garden/garden_im3-480.webp 480w,/assets/img/blog/garden/garden_im3-800.webp 800w,/assets/img/blog/garden/garden_im3-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/garden/garden_im3.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure>]]></content><author><name></name></author><category term="personal"/><summary type="html"><![CDATA[After years of procrastination, I finally decided to get back to writing and sharing my thoughts. A lot has happened in the past few years, and I have been fortunate to have had some amazing experiences. I have met some incredible people, traveled to new places, and learned a lot about myself. I have also had my fair share of challenges. 
One thing that I am super proud of and grateful for is my brand new garden :seedling:. I have always loved plants and gardening, and I have been dreaming of having my own garden for years, but being in school and the constant moving made it impossible. By pure chance, I discovered a beautiful community garden and of course I had to get a plot. In March 2024, I got my plot and I have been spending most of my weekends there since then. There’s something so pure about touching the earth and walking barefoot on the land. A lot of nice physical exercise too! Although I have always been close to plants, this was my first-hand experience doing actual gardening. I have learned so much in the past few months and I am so grateful for the opportunity to grow my own food. I have planted a variety of vegetables and herbs including tomatoes (cherry, striped German, Amish slicer gold), peppers, cucumbers, zucchini, basil, mint, cilantro, carrots, radishes, mustard, cabbage, spinach, onions, butternut squash, and more :seedling: :tomato: :hot_pepper: :cucumber: :herb: :carrot: :onion:. Most of these were planted from tiny seeds and it’s been so rewarding to see them grow and produce. It’s magical to see such a big thing come out of a teeny tiny seed. I could only start growing in mid-April (it was snowing until the first week of April!) and unfortunately, I had to leave for California in May. I was so worried about my plants but I had some amazing friends (VB and MK) who took care of them while I was away. I am so grateful for their help and I am so happy to see my garden thriving. I came back to ripe and delicious cherry tomatoes, cucumbers, and zucchinis in September and I was so happy to see that my plants survived the summer. I have been harvesting and eating fresh veggies from my garden and it’s been so rewarding. It’s October and time to prepare the garden for winter, until next year! I am so grateful for this experience and I am looking forward to growing more next year. 
One outcome of this exploration was that it allowed me to meet a lot of new people and make new friends.]]></summary></entry><entry><title type="html">Introduction to Lab Streaming Layer (LSL)</title><link href="https://nimrobotics.github.io/blog/2022/fnirsl-lsl/" rel="alternate" type="text/html" title="Introduction to Lab Streaming Layer (LSL)"/><published>2022-07-10T21:53:22+00:00</published><updated>2022-07-10T21:53:22+00:00</updated><id>https://nimrobotics.github.io/blog/2022/fnirsl-lsl</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2022/fnirsl-lsl/"><![CDATA[<p><strong>Contents</strong></p> <ul id="markdown-toc"> <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li> <li><a href="#installation" id="markdown-toc-installation">Installation</a></li> <li><a href="#code-example" id="markdown-toc-code-example">Code Example</a></li> <li><a href="#usage" id="markdown-toc-usage">Usage</a></li> </ul> <h1 id="introduction">Introduction</h1> <p>Lab Streaming Layer or LSL is a system for measuring, monitoring, and recording time-synchronized data streams during experiments. It is a tool for data acquisition and analysis that is used in neuroscience, psychology, and other fields. It can be used with a large variety of data types, including EEG, fNIRS, EMG, eye-tracking, and MEG.</p> <h1 id="installation">Installation</h1> <ul> <li>Install <a href="https://docs.microsoft.com/en-us/windows/wsl/about">Windows Subsystem for Linux</a>, as this pipeline is based on Linux. The following two versions have been tested. 
<ul> <li>Ubuntu 18.04 LTS (Bionic)</li> <li>Ubuntu 20.04 LTS (Focal)</li> </ul> </li> <li>Install the core LSL library <ul> <li>Download the latest release (*.deb) from the releases page of the <a href="https://github.com/sccn/liblsl">LSL library</a> on GitHub.</li> <li>Switch to the directory where the *.deb is located.</li> <li>Change permissions with <code class="language-plaintext highlighter-rouge">chmod +x *.deb</code></li> <li>Install the *.deb with <code class="language-plaintext highlighter-rouge">sudo dpkg -i liblsl-bin_*.deb</code></li> </ul> </li> <li>Install the Python binding <a href="https://github.com/labstreaminglayer/liblsl-Python">pylsl</a> <ul> <li><code class="language-plaintext highlighter-rouge">pip3 install pylsl</code></li> </ul> </li> </ul> <h1 id="code-example">Code Example</h1> <script src="https://gist.github.com/nimRobotics/ada4c781ef8a7a3539a6f447a3fc4f7a.js"></script> <h1 id="usage">Usage</h1> <p>1. Run the above code to start streaming triggers: <code class="language-plaintext highlighter-rouge">python3 code.py</code>. Alter the parameters in the example program if needed. 2. Open Aurora on Windows and connect to the fNIRS device. 
3. On the configuration page, select the desired probmap and click <code class="language-plaintext highlighter-rouge">Edit</code> as shown in the image below.</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/lsl/2.PNG-480.webp 480w,/assets/img/blog/lsl/2.PNG-800.webp 800w,/assets/img/blog/lsl/2.PNG-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/lsl/2.PNG" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.1: Configuration screen in NIRx Aurora </div> <p>4. In the <code class="language-plaintext highlighter-rouge">Basic parameters</code> tab, add triggers (ideally, the same number as defined in the Python program).</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/lsl/3.PNG-480.webp 480w,/assets/img/blog/lsl/3.PNG-800.webp 800w,/assets/img/blog/lsl/3.PNG-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/lsl/3.PNG" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.2: Setting Triggers in Aurora </div> <p>5. Verify that the <code class="language-plaintext highlighter-rouge">Data out stream name</code> and <code class="language-plaintext highlighter-rouge">Data in stream name</code> are the same as those defined in the Python program.</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/lsl/4.PNG-480.webp 480w,/assets/img/blog/lsl/4.PNG-800.webp 800w,/assets/img/blog/lsl/4.PNG-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/lsl/4.PNG" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div
class="caption"> Fig.3: Configuring stream names </div> <p>6. Now you should be able to see the data with triggers in the Aurora console.</p> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/lsl/5.PNG-480.webp 480w,/assets/img/blog/lsl/5.PNG-800.webp 800w,/assets/img/blog/lsl/5.PNG-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/lsl/5.PNG" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.4: Triggers placed by the Python program via LSL </div> <p>7. The trigger input can be automated to suit your needs.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Lab Streaming Layer or LSL is a system for measuring, monitoring, and recording ...]]></summary></entry><entry><title type="html">Robotics: Past, present, and future</title><link href="https://nimrobotics.github.io/blog/2021/robotics_trend/" rel="alternate" type="text/html" title="Robotics: Past, present, and future"/><published>2021-05-15T04:41:54+00:00</published><updated>2021-05-15T04:41:54+00:00</updated><id>https://nimrobotics.github.io/blog/2021/robotics_trend</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2021/robotics_trend/"><![CDATA[ <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/ravens-480.webp 480w,/assets/img/blog/ravens-800.webp 800w,/assets/img/blog/ravens-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/ravens.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Google Ravens showing 10 tabletop simulated rearrangement tasks in PyBullet [1] </div> <p>This article outlines the vision, goal, and challenges faced in robotics research. 
It provides a brief overview of the current state of robotics, its research, and the challenges faced by the robotics community. Towards the end, it talks about the exciting future the robotics community as a whole has to offer.</p> <p>Significant developments in robotics started with the industrial revolution. The early robots were quite simple in function and were employed only for particular purposes. The majority of them were specifically developed for the industrial environment, and the other applications we see in today’s world were either absent or in their infancy. Today, developments in robotics are happening faster than ever, with new ideas and papers popping up at unprecedented rates. To a person unacquainted with the latest research and developments, robotics might appear quite stale and limited to the factory environment with repetitive hardcoded tasks. In reality, the field of robotics is exceptionally vast and requires knowledge from a variety of disciplines. The majority of the current work focuses on one of these subareas: robot learning (tasks such as grasping, walking), navigation (vision, SLAM, LiDARs), design (mechanism, compliant), Human-Robot-Interaction, and more. The earlier notion of robots working in the confined factory environment is rapidly changing, and robotics is making its way into numerous newly envisioned areas.</p> <p>A major portion of the robotics research at present focuses on building intelligent machines with human-like ability. One of the goals of robotics research is to generalize to the maximum extent possible. For example, instead of creating a robot to grasp a specific object (or type), the focus is on grasping various object types (rigid, deformable, soft). A major challenge is sim-to-real transfer: the research community often needs to devise solutions using physics-based simulators, as real-world experiments are very costly. 
Such simulators are quite popular now (e.g., PyBullet and OpenAI Gym) for learning-based tasks as they are fast, cheap, and easily accessible. The major problem arises when the learning is transferred to the real-world system, i.e., the physical robot, due to several potential issues such as inaccurate model representation resulting in undesirable behaviors. At times, research work doesn’t include real-world experiments due to various constraints, severely affecting research reproducibility. Furthermore, due to the dynamic nature of the real-world system and inaccuracies, it might not be possible to accurately replicate the experiments even if they have been tested previously in the real world. Reproducibility is challenging in this area, and it significantly hampers the research output. Even methods that have been validated in real experiments tend to fail often, because the tests are usually conducted in a well-designed lab setting that eliminates the harshness and the new, unseen challenges of the actual environment. This level of generalization is a big challenge and needs to be addressed to unlock the capabilities of robotics.</p> <p>The hardcoded nature of robots and confined workspaces is no longer the optimal solution for many tasks requiring flexibility and agility, and more can be achieved by designing unrestrained intelligent machines. Advancements in robotics can significantly transform people’s lives in numerous aspects, including rescue, rehabilitation, security, and social health. Another emerging area is Human-Robot-Interaction (HRI), which shall help improve the combined system performance by building trust between the two players. At present, there is a myriad of dangerous tasks, such as hazardous rescue operations and manual scavenging, that people are conditioned to perform due to a lack of alternatives. 
With the ongoing research and advancement, robotics shall transform people’s lives fundamentally and in several aspects. Building robots similar to humans in terms of agility, perception, and motor skills holds significant weight and potential.</p> <p><strong><em>References</em></strong></p> <p>[1] Zeng, Andy, et al. “Transporter networks: Rearranging the visual world for robotic manipulation.” arXiv preprint arXiv:2010.14406 (2020).</p>]]></content><author><name></name></author><category term="research"/><category term="robotics"/><category term="research"/><summary type="html"><![CDATA[This article outlines the vision, goal, and challenges faced in robotics research...]]></summary></entry><entry><title type="html">Clustering algorithms</title><link href="https://nimrobotics.github.io/blog/2020/clustering/" rel="alternate" type="text/html" title="Clustering algorithms"/><published>2020-07-01T14:40:56+00:00</published><updated>2020-07-01T14:40:56+00:00</updated><id>https://nimrobotics.github.io/blog/2020/clustering</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2020/clustering/"><![CDATA[<p><strong>Contents</strong></p> <ul id="markdown-toc"> <li><a href="#em-clustering" id="markdown-toc-em-clustering">EM clustering</a> <ul> <li><a href="#gaussian-mixture-model-gmm" id="markdown-toc-gaussian-mixture-model-gmm">Gaussian Mixture Model (GMM)</a></li> <li><a href="#expectation-maximization-to-solve-gmm-parameters" id="markdown-toc-expectation-maximization-to-solve-gmm-parameters">Expectation maximization to solve GMM parameters</a></li> </ul> </li> <li><a href="#k-means-clustering" id="markdown-toc-k-means-clustering">k-means clustering</a> <ul> <li><a href="#below-code-shows-the-implementation" id="markdown-toc-below-code-shows-the-implementation">Below code shows the implementation</a></li> </ul> </li> <li><a href="#hierarchical-clustering" id="markdown-toc-hierarchical-clustering">Hierarchical clustering</a></li> </ul> <p>Clustering is an 
unsupervised machine learning technique that groups a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). [Suggested introductory <a href="https://en.wikipedia.org/wiki/Cluster_analysis">reading</a>]. Here, I shall talk about three clustering methods, namely EM clustering, k-means and hierarchical clustering.</p> <h3 id="em-clustering">EM clustering</h3> <hr/> <p><br/> We have \(N\) data samples (unlabelled) \(D= \{ x_1,x_2,\dots,x_N \}\) spanning \(c\) classes \(w_1,w_2,\dots,w_c\). In order to simplify our analysis let’s assume \(c=3\) and \(N=500\). We can write the prior and likelihood as follows</p> <table> <thead> <tr> <th>Prior</th> <th>Likelihood</th> </tr> </thead> <tbody> <tr> <td>\(P(w_1)\)</td> <td>\(P(x/w_1) \sim \mathcal{N}(\mu_1,\Sigma_1)\)</td> </tr> <tr> <td>\(P(w_2)\)</td> <td>\(P(x/w_2) \sim \mathcal{N}(\mu_2,\Sigma_2)\)</td> </tr> <tr> <td>\(P(w_3)\)</td> <td>\(P(x/w_3) \sim \mathcal{N}(\mu_3,\Sigma_3)\)</td> </tr> </tbody> </table> <p><br/></p> <h4 id="gaussian-mixture-model-gmm">Gaussian Mixture Model (GMM)</h4> <p>The problem can be formulated as a GMM</p> \[P(x/\theta)=P(w_1)P(x/w_1)+P(w_2)P(x/w_2)+P(w_3)P(x/w_3)\] <p>Given the data points \(\{ x_1,x_2,\dots,x_N \}\), our goal is to find the parameter \(\theta\) (consisting of \(\mu_i \text{, } \Sigma_i \text{ and } P(w_i)\)) using maximum likelihood estimation (MLE).</p> <h4 id="expectation-maximization-to-solve-gmm-parameters">Expectation maximization to solve GMM parameters</h4> <p>We can find the parameters using an iterative scheme. 
We can initialize \((\mu_i,\Sigma_i)\) with some random values and then iterate over the expectation and maximization steps until convergence is achieved.</p> <p><strong>Expectation</strong></p> <p>Assuming \((\mu_i,\Sigma_i)\) are available, we can perform Bayesian classification on \(\{ x_1,x_2,\dots,x_N \}\) using the likelihood at a given iteration as</p> \[x_k \in w_i \text{ iff } P(x_k/w_i)&gt;P(x_k/w_j) \forall j \neq i\] <p><strong>Maximization</strong></p> <p>Given the samples \(x_k\) assigned to class \(w_i\), which \((\mu_i,\Sigma_i)\) maximize the likelihood of those samples under \(P(x_k/w_i) \sim \mathcal{N}(\mu_i,\Sigma_i)\)?</p> \[\{ x_1,x_2,\dots,x_n \} \in w_i\] <p>Using MLE,</p> \[\mu_i = \frac{1}{N_i} \sum_k x_k \quad \forall x_k \in w_i\] \[\Sigma_i = \frac{1}{N_i} \sum_k (x_k-\mu_i)(x_k-\mu_i)^T \quad \forall x_k \in w_i\] <p>where \(N_i\) is the number of samples assigned to \(w_i\). Thus, we have the updated values for \((\mu_i,\Sigma_i)\) and can perform the expectation step again, iterating until convergence.</p> <h3 id="k-means-clustering">k-means clustering</h3> <hr/> <p><br/> Let us assume an identity covariance matrix, i.e. 
\(\Sigma_i = I\) (so that \(| \Sigma_i | = 1\)); we can then write \(P(x_k/w_i) = \frac{1}{(2\pi)^{d/2}} e^{ \frac{-1}{2} (x_k-\mu_i)^T (x_k-\mu_i)} = \frac{1}{(2\pi)^{d/2}} e^{ \frac{-1}{2} \| x_k-\mu_i \| ^2 }\)</p> <p>Algorithm</p> <ul> <li>Decide the number of clusters, \(c\)</li> <li>Assume initial values of \(\mu_i\) for classes \(1:c\)</li> <li>Expectation <ul> <li>\(x_k \in w_i\) if \(P(x_k/w_i)&gt;P(x_k/w_j) \forall j \neq i\)</li> <li>equivalently, if \(\| x_k-\mu_i \| ^2 \leq \| x_k-\mu_j \| ^2 \forall j \neq i, \text{ then } x_k \in w_i\)</li> </ul> </li> <li>Maximization \(\mu_i = \frac{1}{N_i} \sum_k x_k \forall x_k \in w_i\)</li> <li>Repeat the expectation and maximization steps until convergence, i.e. until \(\| \mu_i^t-\mu_i^{t-1} \| ^2 \to 0\)</li> </ul> <h4 id="below-code-shows-the-implementation">Below code shows the implementation</h4> <script src="https://gist.github.com/nimRobotics/063e52437046f2db3ab39cede2c20a2a.js"></script> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/cluster_2-480.webp 480w,/assets/img/blog/cluster_2-800.webp 800w,/assets/img/blog/cluster_2-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/cluster_2.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.1: Final clusters obtained via k-means </div> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/cluster_2-480.webp 480w,/assets/img/blog/cluster_2-800.webp 800w,/assets/img/blog/cluster_2-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/cluster_2.jpg" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.2: Final clusters obtained via k-means </div> <p>As shown in the above two figures, the k-means algorithm correctly clusters the data points into three
clusters.</p> <h3 id="hierarchical-clustering">Hierarchical clustering</h3> <hr/> <p><br/></p> <p>Hierarchical clustering is a clustering method which seeks to build a hierarchy of clusters based on some distance or similarity metric. This can be further classified as</p> <ul> <li>Agglomerative clustering: (“bottom-up” approach) each point starts as a separate cluster, and clusters are merged successively until a single cluster remains.</li> <li>Divisive clustering: (“top-down” approach) starts with one single cluster and keeps dividing it until each point forms its own cluster.</li> <li>Isodata clustering: combines features of both agglomerative and divisive clustering to achieve optimal clusters</li> </ul> <p>Based on the distance metric used, agglomerative clustering can be further classified as</p> <ul> <li>Single link or nearest neighbour <ul> <li>Distance between two clusters \(c_1\) and \(c_2\), \(d = \min \| X-Y \|^2 \text{ where, } X \in c_1 \text{ and } Y \in c_2\)</li> </ul> </li> <li>Complete link or farthest neighbour <ul> <li>Distance between two clusters \(c_1\) and \(c_2\), \(d = \max \| X-Y \|^2 \text{ where, } X \in c_1 \text{ and } Y \in c_2\)</li> </ul> </li> </ul> <p><strong>Example: Single Link</strong></p> <p>Suppose that we have the given points \(x_1,x_2,\dots,x_7\) with the distances shown below.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x1 ------- x2 ------- x3 ------- x4 ------- x5 ------- x6 ------- x7
     1          1.1        1.2        1.3        1.4        1.5 
</code></pre></div></div> <p>To form clusters, we treat the individual points as clusters and grow them. Find the distance between the clusters using \(d = \min \| X-Y \|^2 \text{ where, } X \in c_1 \text{ and } Y \in c_2\) and merge \(c_i\) with \(c_j\) if \(d(c_i,c_j) \leq d(c_i,c_k) \forall k \neq j\). Finally we can obtain</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x1         x2         x3         x4         x5         x6         x7
|          |          |           |         |          |          |
|          |          |           |         |          |          |
------------ 1        |           |         |          |          |
     |                |           |         |          |          |
     |                |           |         |          |          |
     ------------------ 1.1       |         |          |          |
              |                   |         |          |          |
              |                   |         |          |          |
              --------------------- 1.2     |          |          |
                        |                   |          |          |
                        |                   |          |          |
                        --------------------- 1.3      |          |
                                   |                   |          |
                                   |                   |          |
                                   --------------------- 1.4      |
                                             |                    |
                                             |                    |
                                             ---------------------- 1.5

</code></pre></div></div> <p><strong>Example: Complete Link</strong></p> <p>We can use the same example distances given in single link example. Here, we can find the distance between the clusters using \(d = \max \| X-Y \|^2 \text{ where, } X \in c_1 \text{ and } Y \in c_2\) and merge \(c_i\) with \(c_j\) if \(d(c_i,c_j) \leq d(c_i,c_k) \forall k \neq j\). Finally we can obtain</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x1         x2         x3         x4         x5         x6         x7
|          |          |           |         |          |          |
|          |          |           |         |          |          |
------------ 1        |           |         |          |          |
     |                ------------- 1.2     |          |          |
     |                      |               ------------ 1.4      |
     |                      |                     |               |    
     |                      |                     |               |  
     |                      |                     ----------------- 2.9  
     ------------------------ 3.3                         |
              |                                           |
              |                                           |
              --------------------------------------------- 7.5
                     
</code></pre></div></div>]]></content><author><name></name></author><category term="machine-learning"/><category term="machine-learning"/><category term="clustering"/><category term="algorithms"/><summary type="html"><![CDATA[This post shall walk you through various clustering algorithms and the math behind them...]]></summary></entry><entry><title type="html">Introduction to Neural Networks</title><link href="https://nimrobotics.github.io/blog/2020/nn/" rel="alternate" type="text/html" title="Introduction to Neural Networks"/><published>2020-04-13T06:46:22+00:00</published><updated>2020-04-13T06:46:22+00:00</updated><id>https://nimrobotics.github.io/blog/2020/nn</id><content type="html" xml:base="https://nimrobotics.github.io/blog/2020/nn/"><![CDATA[<p><strong>Contents</strong></p> <ul id="markdown-toc"> <li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li> <li><a href="#forward-pass" id="markdown-toc-forward-pass">Forward pass</a></li> <li><a href="#backward-pass" id="markdown-toc-backward-pass">Backward pass</a></li> <li><a href="#python-implementation" id="markdown-toc-python-implementation">Python implementation</a></li> </ul> <h1 id="introduction">Introduction</h1> <hr/> <p><br/> In this post, we shall see how to build a feedforward neural network from scratch and use it for MNIST digit classification. First, we shall bring out the mathematical framework of the feed-forward neural network. This network shall use sigmoid activation for the hidden layers and softmax for the last layer with multi-class cross entropy loss (see later sections for details). Before proceeding any further we shall now define a few notations and terminologies.</p> <ul> <li> <p>\(l\) indicates a layer in the network, here, \(1 \leq l \leq N\) where $N$ represents the number of layers in the network (Note: by convention, the input layer is not counted when we say $N$-layer neural network. 
Furthermore, for the network shown in Fig.1 $N=2$).</p> </li> <li> <p>Subscripts $k,j,i,\dots$ usually denote neuron indices in layers $l=N,N-1,N-2,\dots$</p> </li> <li> <p>$z_k^l$ represents the weighted sum of activations from the previous layer at layer $l$. That is,</p> \[z_k^l=\sum_j w_{kj}a_j^{l-1}+b_k\] </li> <li> <p>$a^l_k$ represents the neuron activations at layer $l$, $a^l_k=f(z^l_k)$, where $f(.)$ is the activation function. We shall be using softmax activation for the last layer $(l=N)$</p> \[a^N_k = \frac{e^{z_k^N}}{\sum_{k'=1}^{c} e^{z_{k'}^N}}\] <p>Here, $c$ is the number of classes. For all other layers $(l\neq N)$ we shall use the sigmoid activation function.</p> \[a^l_k = \frac{1}{1+e^{-z_k^l}}\] </li> <li> <p>$y_k$ is the $k^{th}$ entry of the ground truth (one-hot encoded) vector $y$ for a given sample.</p> </li> <li> <p>$\hat{y}_k$ is the $k^{th}$ entry of the predicted vector $\hat{y}$ for that sample, i.e. the predicted probability of class $k$.</p> </li> </ul> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/blog/nn_vis2-480.webp 480w,/assets/img/blog/nn_vis2-800.webp 800w,/assets/img/blog/nn_vis2-1400.webp 1400w," sizes="95vw" type="image/webp"/> <img src="/assets/img/blog/nn_vis2.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> <div class="caption"> Fig.1: A schematic of a two-layered neural network </div> <h1 id="forward-pass">Forward pass</h1> <hr/> <p><br/> Now let us consider a Neural Network having two layers $(N=2)$ as shown in Fig.1. Here, we have $n$ neurons in the input layer (features), $nh$ neurons in the hidden layer and $no$ neurons in the output layer. As an aside, $1\leq i \leq n$, $1\leq j \leq nh$, and $1\leq k \leq no$. We shall now lay out the equations for the forward pass through the network. For layer $l=N-1=1$</p> \[z_j^l=\sum_i^n w_{ji}a_i^{l-1}+b_j\] <p>where $a_i^{l-1}=a_i^0 = x_i^0$. 
Now we shall pass this weighted sum $z_j^l$ through the activation function $f_1()$ i.e. sigmoid activation.</p> \[a_j^l=f_1(z_j^l)\] <p>This output of layer $l=1$ is now fed to layer $l=2$ i.e. the output layer. For layer $l=N=2$</p> \[z_k^l=\sum_j^{nh} w_{kj}a_j^{l-1}+b_k\] <p>Similarly, we shall feed this weighted sum to another activation function $f_2()$ i.e. softmax activation.</p> \[a_k^l=f_2(z_k^l)\] <p>The final output $a_k^l$ is the predicted probability for the $k^{th}$ class. These class predictions might not make much sense at first, as they depend on weights and biases which are randomly initialized. Our prime goal here is to find the weights such that the predicted class probabilities are consistent with the ground truth labels in the training data. In order to achieve this, we need to come up with a metric to measure the goodness (or badness) of the network. This can be done by constructing a loss function. The loss function can then be used to perform optimization by updating weights. We shall be using the multi-class cross-entropy loss in our approach. The loss for a given sample can be calculated using its one-hot encoded vector $(y)$ and the prediction $\hat{y}$ (which is essentially $a_k^l$ or $a_k^2$ for $l=2$).</p> \[L(\hat{y},y) = -\sum^c_{k=1} y_k \log \hat{y}_k\] <p>This can then be used to calculate loss across all samples (total number of samples: $m$) as</p> \[J(w^1,b^1,\dots) = \frac{1}{m} \sum_{i=1}^mL(\hat{y}^i,y^i)\] <p>One important result that we take directly without any derivation is the gradient of $J$ (for a single sample) with respect to $z_k^{l=2}$ (simplified as $z^2$).</p> \[\frac{\partial J}{\partial z^{2}} = \hat{y} - y\] <h1 id="backward-pass">Backward pass</h1> <hr/> <p><br/> In this section, we shall bring out the mechanism to update the weights and biases by backpropagating the loss into the network. 
We shall update the weights using an iterative approach, more precisely Stochastic Gradient Descent (<a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent">SGD</a>). The weights can be updated using</p> \[w_{kj}(t+1) = w_{kj}(t) - \eta \frac{\partial J}{\partial w_{kj}}\] \[w_{ji}(t+1) = w_{ji}(t) - \eta \frac{\partial J}{\partial w_{ji}}\] <p>Here, $\eta$ is a hyperparameter called the learning rate. Similarly, the biases can also be updated.</p> \[b_{k}(t+1) = b_{k}(t) - \eta \frac{\partial J}{\partial b_{k}}\] \[b_{j}(t+1) = b_{j}(t) - \eta \frac{\partial J}{\partial b_{j}}\] <p>Now, our goal is to find the gradients. These gradients can be obtained by backpropagating through the network using the chain rule.</p> \[\frac{\partial J}{\partial w_{kj}} = \frac{\partial J}{\partial z^{2}} \frac{\partial z^{2}}{\partial w_{kj}} = (\hat{y} - y) a_j\] \[\frac{\partial J}{\partial b_{k}} = \frac{\partial J}{\partial z^{2}} \frac{\partial z^{2}}{\partial b_{k}} = (\hat{y} - y)\] \[\frac{\partial J}{\partial w_{ji}} = \frac{\partial J}{\partial z^{2}} \frac{\partial z^{2}}{\partial a_{j}} \frac{\partial a_{j}}{\partial z^{1}} \frac{\partial z^{1}}{\partial w_{ji}}= (\hat{y} - y) w_{kj} f_1(z^1)(1-f_1(z^1)) a_i\] \[\frac{\partial J}{\partial b_{j}} = \frac{\partial J}{\partial z^{2}} \frac{\partial z^{2}}{\partial a_{j}} \frac{\partial a_{j}}{\partial z^{1}} \frac{\partial z^{1}}{\partial b_{j}}= (\hat{y} - y) w_{kj} f_1(z^1)(1-f_1(z^1))\] <p>Here, $a_j$ is the hidden-layer activation, $a_i$ is the input, and in the last two expressions a sum over the output index $k$ is implied. Hence, by using the above set of equations we can run SGD and update the trainable parameters for this two-layered network. The same idea can be extended to build multilayered networks.</p> <h1 id="python-implementation">Python implementation</h1> <hr/> <p><br/> The full Python3 implementation with explanation can be found in this <a href="https://github.com/nimRobotics/vanilla_nn/blob/master/NN_scratch.ipynb">Jupyter notebook</a>. 
The model was trained using different combinations of activation functions and number of neurons (all models have the same learning rate of $0.1$). The test accuracy is shown in Table 1. The accuracy is highest for the sigmoid activation using $256$ neurons. For the sigmoid activation the accuracy increases with the number of neurons, while for the ReLU and tanh activations it first increases and then decreases. This may be because we did not apply early stopping in our training process; early stopping is known to act as a form of regularization, similar in effect to L2 regularization. We wanted to perform our analysis on the same set of parameters. In practice, it is a good idea to use early stopping, i.e. to halt training once the validation loss exceeds the previous validation loss by more than a chosen threshold. One important observation here is that the performance of ReLU is not as expected. This is attributed to the fact that we required a higher learning rate of 0.1 in SGD for the sigmoid activation function to converge. For lower learning rates SGD did not converge with sigmoid, whereas the performance of ReLU improved significantly, as expected. 
Therefore, to compare all models under the same parameters, we finally chose a learning rate of 0.1.</p> <style>table{border-collapse:collapse;border-spacing:0;border:1px solid #000}th{border:1px solid #000}td{border:1px solid #000}</style> <div align="center"> <table align="center"> <thead> <tr class="header"> <th style="text-align: left;"><p><strong>Activation</strong></p> <p><strong># Neurons</strong></p></th> <th style="text-align: center;"><strong>Sigmoid</strong></th> <th style="text-align: center;"><strong>ReLU</strong></th> <th style="text-align: center;"><strong>Tanh</strong></th> </tr> </thead> <tbody> <tr class="odd"> <td style="text-align: left;"><strong>32</strong></td> <td style="text-align: center;">89.44</td> <td style="text-align: center;">39.12</td> <td style="text-align: center;">71.06</td> </tr> <tr class="even"> <td style="text-align: left;"><strong>64</strong></td> <td style="text-align: center;">89.98</td> <td style="text-align: center;">64.16</td> <td style="text-align: center;">74.40</td> </tr> <tr class="odd"> <td style="text-align: left;"><strong>128</strong></td> <td style="text-align: center;">90.32</td> <td style="text-align: center;">61.82</td> <td style="text-align: center;">69.78</td> </tr> <tr class="even"> <td style="text-align: left;"><strong>256</strong></td> <td style="text-align: center;">91.62</td> <td style="text-align: center;">31.68</td> <td style="text-align: center;">55.06</td> </tr> </tbody> </table> <p>Table 1: Test accuracy (%) using different numbers of hidden neurons and activations</p> </div> <p>Training time for the same set of parameters is shown in Table 2. It can be observed that tanh takes the longest to train whereas ReLU is the fastest. 
Training time also increases with the number of neurons, as expected.</p> <div align="center"> <table align="center"> <thead> <tr class="header"> <th style="text-align: left;"><p><strong>Activation</strong></p> <p><strong># Neurons</strong></p></th> <th style="text-align: center;"><strong>Sigmoid</strong></th> <th style="text-align: center;"><strong>ReLU</strong></th> <th style="text-align: center;"><strong>Tanh</strong></th> </tr> </thead> <tbody> <tr class="odd"> <td style="text-align: left;"><strong>32</strong></td> <td style="text-align: center;">133.137</td> <td style="text-align: center;">116.511</td> <td style="text-align: center;">145.912</td> </tr> <tr class="even"> <td style="text-align: left;"><strong>64</strong></td> <td style="text-align: center;">170.921</td> <td style="text-align: center;">152.531</td> <td style="text-align: center;">205.702</td> </tr> <tr class="odd"> <td style="text-align: left;"><strong>128</strong></td> <td style="text-align: center;">255.141</td> <td style="text-align: center;">207.650</td> <td style="text-align: center;">298.367</td> </tr> <tr class="even"> <td style="text-align: left;"><strong>256</strong></td> <td style="text-align: center;">371.729</td> <td style="text-align: center;">288.749</td> <td style="text-align: center;">448.806</td> </tr> </tbody> </table> <p>Table 2: Training time <span class="math inline">\((s)\)</span> for different numbers of hidden neurons and activations</p> </div>]]></content><author><name></name></author><category term="research"/><category term="machine-learning"/><category term="neural-networks"/><summary type="html"><![CDATA[In this blog we shall see how to build a Feedforward Neural Network from scratch, we shall use ...]]></summary></entry></feed>