ResIN R Workshop Bremen CSS 2023

Teaser I

Teaser II

Teaser III

Workshop contents

Data preparation & cleaning
Main algorithm
qgraph and igraph integration & visualization with ggplot
Advanced features of the ResIN function + exercise

Workshop prerequisites

To get started, let’s import some political attitude data from the 2020 wave of the American National Election Studies (ANES). The ResIN package takes care of all computational legwork for the examples presented here; we need to load the remaining packages solely for the purpose of visualization.

## Loading required packages: Here with "pacman" manager 
if(!require("pacman")) install.packages('pacman')
library(pacman)
p_load(ResIN) 
p_load(knitr)
p_load(tidyverse)
p_load(visNetwork)
## devtools::install_github("nsgrantham/ggdark") ## ggdark (fancy plotting) requires devtools install the first time around
p_load(ggdark)

## 2020 Wave of the ANES as example data
anes_timeseries_2020 <- read.csv("anes_timeseries_2020.csv")

1. Data preparation & cleaning

Select items for ResIN network and optional co-variates
Clean data
ResIN node variables: Character, factor, or numeric are accepted. Be sure to categorize continuous variables beforehand and to apply uniform coding within each variable. ResIn handles character lables for numeric values very well - this feature is useful for visualization
Co-variates should be numeric
NA’s: Apply a uniform code for all
Non-monotonic Likert scale items do not need to be recoded <3

1.1 Data preparation - before

V200002	V200003	V200004	V200005	V200006	V200007	V200008
3	2	3	0	-2	-2	-2
3	2	3	0	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3

1.2 Data preparation - recode missing

## Selecting Res-IN node variables: 
anes_nodes <- dplyr::select(anes_timeseries_2020, c("V201336", "V202257", 
                                                    "V201417", "V201312", 
                                                    "V201416", "V201262", 
                                                    "V202337", "V201258"))

## Recoding missing values (check ANES codebook!)
anes_nodes$V201336[anes_nodes$V201336 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V202257[anes_nodes$V202257 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V201417[anes_nodes$V201417 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V201312[anes_nodes$V201312 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V201416[anes_nodes$V201416 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V201262[anes_nodes$V201262 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V202337[anes_nodes$V202337 %in% c(-9, -8, -7, -6, -5, 99)] <- NA
anes_nodes$V201258[anes_nodes$V201258 %in% c(-9, -8, -7, -6, -5, 99)] <- NA

## Give more recognizable names
colnames(anes_nodes) <- c("abort", "income", "immigr", "welfare", "gay_mar",
                          "environm", "gun_own", "aid_black")

1.4 Data preparation - adding lables

## Re-code response labels to make them more recognizable 
anes_nodes <- anes_nodes %>% dplyr::mutate(
                  abort = dplyr::recode(abort, `1` = "never",
                                        `2` = "spec_cases",
                                        `3` = "in_need",
                                        `4` = "always",
                                        `5` = "other"),
                  
  income = dplyr::recode(income, `1` = "favor",`2` = "oppose", `3` = "neither"),
  immigr = dplyr::recode(immigr, `1` = "send_back", `2` = "guest_prog",`3` = "neither"),
  welfare = dplyr::recode(welfare, `1` = "increase",`2` = "decrease", `3` = "same",),
  gay_mar = dplyr::recode(gay_mar, `1` = "recogn",`2` = "civ_union", `3` = "no_recogn"),
  environm = dplyr::recode(environm, `1` = "regul+++",`2` = "regul++", `3` = "regul+", `4` = "neut",`5` = "regul-", `6` = "regul--", `7` = "regul---"),
  gun_own = dplyr::recode(gun_own, `1` = "more_diffic",`2` = "easier", `3` = "same"),
  aid_black = dplyr::recode(aid_black, `1` = "yes++",`2` = "yes+", `3` = "yes", `4` = "neut",`5` = "no", `6` = "no+", `7` = "no++"))

1.5 Data preparation - after

abort	income	immigr	welfare	gay_mar	environm	gun_own	aid_black
spec_cases	favor	send_back	decrease	no_recogn	regul—	same	no++
always	neither	neither	decrease	recogn	NA	same	neut
always	favor	NA	increase	recogn	regul+++	more_diffic	yes
never	favor	send_back	same	recogn	regul+++	same	NA
in_need	neither	guest_prog	same	civ_union	neut	same	no+
always	neither	neither	same	recogn	NA	more_diffic	yes
in_need	neither	guest_prog	increase	recogn	regul-	more_diffic	no++
always	neither	send_back	same	no_recogn	NA	more_diffic	no++
always	neither	send_back	same	recogn	NA	more_diffic	NA
always	neither	guest_prog	same	civ_union	NA	easier	neut
never	neither	neither	increase	no_recogn	regul+++	more_diffic	yes++
spec_cases	oppose	neither	same	recogn	regul+++	more_diffic	no++
always	neither	send_back	increase	civ_union	neut	more_diffic	no++
always	favor	neither	same	recogn	regul+++	more_diffic	yes
always	oppose	guest_prog	same	civ_union	NA	more_diffic	yes++
in_need	neither	neither	same	recogn	NA	easier	neut
always	favor	neither	same	recogn	regul+	same	neut
always	favor	neither	same	recogn	regul+	more_diffic	yes

2. Main Res-IN algorithim

2.1 Item binarization

2.2 Res-In adjacency matrix

2.3 Force-directed location estimation

2.4 Coordinate rotation (PCA)

2.1.1 Item binarization

Expand original data-set into dummy variables for each possible item response
Results in by data-frame, where denotes the number of columns in the original data-frame, and is the sum of all unique item response options within columns .
For the ANES example, 8 item selection results in a dataframe of binary column vectors.

2.1.2 Item binarization - example

## Using the ResIN function from the ResIN package: 
df_dummies <- ResIN::ResIN(anes_nodes)$df_dummies

## Number of collumns in the binarized, item-response data-frame:
dim(df_dummies)[2]

## [1] 34

abort_always	abort_in_need	abort_never	abort_spec_cases	income_favor	income_neither
0	0	0	1	1	0
1	0	0	0	0	1
1	0	0	0	1	0
0	0	1	0	1	0
0	1	0	0	0	1
1	0	0	0	0	1
0	1	0	0	0	1
1	0	0	0	0	1

2.2 Res-IN adjacency matrix

Compute pairwise correlation matrix
Set within-item correlations to zero
Set all negative correlations to zero (optional but recommended)

2.2.1 ResIN (sparse) correlation matrix (no within-item correlations)

## Generate a symmetric correlation matrix & apply item response lables
res_in_cor <- ResIN(anes_nodes, remove_negative = FALSE,  cluster = FALSE)$adj_matrix

	abort_always	abort_in_need	abort_never	abort_other	abort_spec_cases	income_favor
abort_always	0.00	0.00	0.00	0.00	0.00	0.48
abort_in_need	0.00	0.00	0.00	0.00	0.00	-0.12
abort_never	0.00	0.00	0.00	0.00	0.00	-0.36
abort_other	0.00	0.00	0.00	0.00	0.00	-0.05
abort_spec_cases	0.00	0.00	0.00	0.00	0.00	-0.36
income_favor	0.48	-0.12	-0.36	-0.05	-0.36	0.00
income_neither	-0.10	0.04	0.12	0.00	0.04	0.00
income_oppose	-0.47	0.10	0.27	0.06	0.34	0.00
immigr_guest_prog	-0.18	0.06	0.03	0.03	0.14	-0.16

2.2.2 Zero-out all remaining negative correlations (default & strongly recommended)

res_in_cor <- ResIN(anes_nodes, remove_negative = TRUE)$adj_matrix

	abort_always	abort_in_need	abort_never	abort_other	abort_spec_cases	income_favor
abort_always	0.00	0.00	0.00	0.00	0.00	0.48
abort_in_need	0.00	0.00	0.00	0.00	0.00	0.00
abort_never	0.00	0.00	0.00	0.00	0.00	0.00
abort_other	0.00	0.00	0.00	0.00	0.00	0.00
abort_spec_cases	0.00	0.00	0.00	0.00	0.00	0.00
income_favor	0.48	0.00	0.00	0.00	0.00	0.00
income_neither	0.00	0.04	0.12	0.00	0.04	0.00
income_oppose	0.00	0.10	0.27	0.06	0.34	0.00
immigr_guest_prog	0.00	0.06	0.03	0.03	0.14	0.00

2.3 Force-directed location estimation

We rely on a force-directed algorithm that models item response nodes as a simulated physical system
Positively correlated nodes attract one another; un-correlated (and negatively correlated) nodes repel one another
Many possible algorithms to choose from; we use the commonly used “Fruchterman-Reingold” method (main work-horse in igraph package)
Can be modeled in 2D or 3D; for simplicity we focus on 2D solution here

2.3.1 Force-directed location estimation - using igraph package

We rely on the “layout_with_fr” function from the igraph package (Nepusz et.al. 2020) https://cran.r-project.org/web/packages/igraph/index.html

## Here we take advantage of the "layout_with_fr" function taken from the igraph package
res_in_layout <- ResIN(anes_nodes)$node_frame

2.3.2 Force-directed location estimation - data-frame

x	y	node_names
-8.50	-0.21	abort_always
1.34	-0.41	abort_in_need
5.11	1.20	abort_never
2.61	-2.06	abort_other
4.94	-0.35	abort_spec_cases
-8.44	0.62	income_favor
1.81	1.25	income_neither
5.00	0.21	income_oppose
4.36	-1.84	immigr_guest_prog
-6.83	0.04	immigr_neither
6.57	1.22	immigr_send_back
5.51	0.15	welfare_decrease
-9.33	-0.45	welfare_increase
-1.56	0.12	welfare_same
4.14	-0.61	gay_mar_civ_union

2.3.3 Force-directed location estimation - plot

2.4 Coordinate rotation

We use principle components analysis (PCA) to make graphical result more interpretable
X-axis rotated s.t. it captures most of the variance within the attitude network
We use orthogonal (non-distorting) rotation but different oblique rotations would theoretically be possible as well

2.4.1 Coordinate rotation using PCA

## Performing PCA with the R-native "prcomp" function; Note that is step is no longer required as ResIN package automates this step for us
res_in_layout_pca <- prcomp(res_in_layout[,1:2])$x
res_in_layout[1:2] <- res_in_layout_pca

2.4.1 Coordinate rotation using PCA - R example

3 Integration & visualization with qgraph and igraph packages

qgraph-package
igraph-package
Visualization with ggplot

3.1 qgraph-integration

## qgraph package (Epskamp and Costantini et.al.)

resin_qgraph <- ResIN_qgraph(anes_nodes, plot_graph = TRUE,
                    qgraph_arglist = list(layout = "spring", maximum = 1, vsize = 4,
                    DoNotPlot = TRUE, sampleSize = nrow(anes_nodes),
                    title = "ANES 2020 ResIN plot made with qgraph package",
                    mar = c(3,3,3,3), normalize = FALSE))

3.2 Visualization with igraph and ggplot2

ResIN function can transform edge lists generated by igraph output to ggplot friendly data.frame
Here, we use the geom_curve() layer in ggplot for network visualization
We can add node-level covariates or network statistics to enhance our plot

3.2.1 Visualization with ggplot2 - prep work

## The ResIN function by default provides a plotting-optimized, edge-list data-frame 
edgelist_frame <- ResIN(anes_nodes)$edgelist_frame

3.2.2 Visualization with ggplot2 example data-frame

from	to	weight	from.x	from.y	to.x	to.y	x	y	node_names
abort_always	income_favor	0.48	-8.50	-0.21	-8.44	0.62	-8.50	-0.21	abort_always
abort_always	immigr_neither	0.30	-8.50	-0.21	-6.83	0.04	-8.50	-0.21	abort_always
abort_always	welfare_increase	0.42	-8.50	-0.21	-9.33	-0.45	-8.50	-0.21	abort_always
abort_always	gay_mar_recogn	0.67	-8.50	-0.21	-7.68	-0.28	-8.50	-0.21	abort_always
abort_always	environm_regul++	0.18	-8.50	-0.21	-6.15	0.07	-8.50	-0.21	abort_always
abort_always	environm_regul+++	0.55	-8.50	-0.21	-9.41	0.18	-8.50	-0.21	abort_always
abort_always	gun_own_more_diffic	0.50	-8.50	-0.21	-7.83	0.52	-8.50	-0.21	abort_always
abort_always	aid_black_yes	0.14	-8.50	-0.21	-5.05	0.08	-8.50	-0.21	abort_always
abort_always	aid_black_yes+	0.34	-8.50	-0.21	-8.29	-1.06	-8.50	-0.21	abort_always
abort_always	aid_black_yes++	0.45	-8.50	-0.21	-9.35	0.90	-8.50	-0.21	abort_always
abort_in_need	income_neither	0.04	1.34	-0.41	1.81	1.25	1.34	-0.41	abort_in_need
abort_in_need	income_oppose	0.10	1.34	-0.41	5.00	0.21	1.34	-0.41	abort_in_need
abort_in_need	immigr_guest_prog	0.06	1.34	-0.41	4.36	-1.84	1.34	-0.41	abort_in_need
abort_in_need	immigr_neither	0.01	1.34	-0.41	-6.83	0.04	1.34	-0.41	abort_in_need
abort_in_need	welfare_decrease	0.06	1.34	-0.41	5.51	0.15	1.34	-0.41	abort_in_need

3.2.3 Visualization with ggplot2 - basic plot

There are many options to plot networks with ggplot. Here, we show an option using the geom_curve function which works well with edge-list type dataframes (i.e. with from.x, from.y, to.x, to.y edge-specific information).

ANES_net_fig <- ggplot() +
    geom_curve(data = edgelist_frame, aes(x = from.x, xend = to.x, y = from.y,
                yend = to.y), curvature = 0, color = "grey") +
        ## geom_curve requires the x, xend, y and yend aesthetics to be able to plot the network edges
    geom_text(data = edgelist_frame, aes(x = from.x, y = from.y, label = from), size = 3, color = "black") +
    geom_text(data = edgelist_frame, aes(x = to.x, y = to.y, label = to), size = 3, color = "black") +
        ## Make sure to plot node labels both based on the to and from columns in the plotting frame!
    ggtitle("Belief Structure among ANES 2020 Respondents")+
    theme_bw()+
    expand_limits(x = c(min(edgelist_frame$x-2), max(edgelist_frame$x+2)),
                  y = c(min(edgelist_frame$y-2), max(edgelist_frame$y+2))) +
    theme(axis.text.x = element_blank(), axis.title.x = element_blank(),
          axis.text.y = element_blank(), axis.title.y = element_blank(),
          axis.ticks = element_blank(), panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(), legend.position = "bottom",
          legend.text = element_blank(), plot.title = element_text(hjust = 0.5))

3.2.4 Visualization with ggplot2 - basic plot

3.3.5 Visualization with ggplot2 - more fancy plot

Let’s add a few additional visual elements to enhance our plot. This step also prepares us to for multi-variate data-analysis leveraging each item-response node’s spacial position and various outside information. In this example, we show how to add information on Feeling-thermometer scores to map information on affective polarization between partisan groups onto the attitude space.

## Here, we use Democratic and & Republican feelings thermometers: V201156 and V201157

anes_timeseries_2020$V201156[anes_timeseries_2020$V201156 < 0] <- NA
anes_timeseries_2020$V201157[anes_timeseries_2020$V201157 < 0] <- NA

anes_nodes_plus <- cbind(anes_nodes, anes_timeseries_2020$V201157, anes_timeseries_2020$V201156)
colnames(anes_nodes_plus) <- c(colnames(anes_nodes), "V201156", "V201157")

## We would like to extract the mean of each thermometer variable for the subset of respondents who selected each item response:
ResIN_anes <- ResIN(anes_nodes_plus, node_vars = colnames(anes_nodes), node_covars = c("V201156", "V201157"), 
                        node_costats = c("mean", "mean"), network_stats = TRUE)

edgelist_frame <- ResIN_anes$edgelist_frame

## We can simply calculate node-level affective polarization as the distance between Democrat and Republican FT's  
edgelist_frame$aff_pola <- abs(edgelist_frame$V201156_mean - edgelist_frame$V201157_mean)

3.3.7 Visualization with ggplot2 - data-frame

from	to	weight	from.x	from.y	to.x	to.y	x	y	node	D_mean_FT	node_label	aff_pola	NA	NA	NA	NA
abort_always	income_favor	0.480	-8.498	-0.208	-8.443	0.617	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	immigr_neither	0.302	-8.498	-0.208	-6.832	0.038	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	welfare_increase	0.425	-8.498	-0.208	-9.325	-0.446	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	gay_mar_recogn	0.674	-8.498	-0.208	-7.681	-0.282	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	environm_regul++	0.183	-8.498	-0.208	-6.152	0.071	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	environm_regul+++	0.545	-8.498	-0.208	-9.406	0.175	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	gun_own_more_diffic	0.498	-8.498	-0.208	-7.834	0.517	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	aid_black_yes	0.139	-8.498	-0.208	-5.052	0.078	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	aid_black_yes+	0.345	-8.498	-0.208	-8.289	-1.063	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_always	aid_black_yes++	0.452	-8.498	-0.208	-9.348	0.900	-8.498	-0.208	abort_always	0.0018063	4.041088	4.041088	28.74123	59.17015	abort_always	30.43
abort_in_need	income_neither	0.038	1.336	-0.411	1.806	1.254	1.336	-0.411	abort_in_need	0.0023380	1.424381	1.424381	48.37396	40.89372	abort_in_need	7.48
abort_in_need	income_oppose	0.098	1.336	-0.411	5.004	0.207	1.336	-0.411	abort_in_need	0.0023380	1.424381	1.424381	48.37396	40.89372	abort_in_need	7.48
abort_in_need	immigr_guest_prog	0.056	1.336	-0.411	4.362	-1.836	1.336	-0.411	abort_in_need	0.0023380	1.424381	1.424381	48.37396	40.89372	abort_in_need	7.48
abort_in_need	immigr_neither	0.006	1.336	-0.411	-6.832	0.038	1.336	-0.411	abort_in_need	0.0023380	1.424381	1.424381	48.37396	40.89372	abort_in_need	7.48
abort_in_need	welfare_decrease	0.065	1.336	-0.411	5.507	0.154	1.336	-0.411	abort_in_need	0.0023380	1.424381	1.424381	48.37396	40.89372	abort_in_need	7.48

3.3.8 Visualization with ggplot2 - more fancy plot

ANES_dark <- ggplot() +
    geom_curve(data = edgelist_frame, aes(x = from.x, xend = to.x, y = from.y,
                yend = to.y, size = weight^10), curvature = 0.15, color = "grey", alpha = 0.4) +
                  ## Note that we use the size aesthetic to control the thickness of the graph edges
    geom_text(data = edgelist_frame, aes(x = from.x, y = from.y, label = from, color = aff_pola), size = 6) +
    geom_text(data = edgelist_frame, aes(x = to.x, y = to.y, label = to, color = aff_pola), size = 6) +
                  ## Make sure to plot node labels both based on the to and from columns in the plotting frame
    ggtitle("Belief Structure among ANES 2020 Respondents")+
    expand_limits(x = c(min(edgelist_frame$x-1.5), max(edgelist_frame$x+1.5)),
                  y = c(min(edgelist_frame$y-1.5), max(edgelist_frame$y+1.5))) +
    scale_color_continuous(name = "Intensity of Partisan Affective Polarization", high = "yellow", low = "blue")+
    scale_size_continuous(guide = "none")+
    dark_theme_gray()+
    theme(axis.text.x = element_blank(), axis.title.x = element_blank(),
          axis.text.y = element_blank(), axis.title.y = element_blank(),
          axis.ticks = element_blank(), panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(), legend.position = "bottom",
          legend.text = element_blank(), plot.title = element_text(hjust = 0.5, size=18))

3.3.9 Visualization with ggplot2 - more fancy plot

3.4. Correlational analysis

Although statistical analysis with ResIN can be as complex as you want, we are simply correlating feelings thermometer scores with the spacial location of ResIN nodes. Also, we run a simple linear regression model predicting whether left-wing or right wing attitudes more easily connect to other idea elements in the belief system.

## Correlation between ResIN Left-right position and mean Republican feelings thermometer:
cor(ResIN_anes$node_frame$x, ResIN_anes$node_frame$V201156_mean)

## [1] 0.9636615

## Simple regression model predicting node centrality from squared x-position
ResIN_anes$node_frame$x_sq <- (ResIN_anes$node_frame$x-mean(ResIN_anes$node_frame$x))^2

summary(lm(ResIN_anes$node_frame$Closeness ~ ResIN_anes$node_frame$x + 
             ResIN_anes$node_frame$x_sq))

## 
## Call:
## lm(formula = ResIN_anes$node_frame$Closeness ~ ResIN_anes$node_frame$x + 
##     ResIN_anes$node_frame$x_sq)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -7.665e-04 -1.013e-04  1.499e-05  1.024e-04  4.021e-04 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 2.466e-03  8.376e-05  29.437  < 2e-16 ***
## ResIN_anes$node_frame$x     2.660e-05  9.336e-06   2.849  0.00773 ** 
## ResIN_anes$node_frame$x_sq -5.354e-06  2.123e-06  -2.522  0.01700 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0002375 on 31 degrees of freedom
## Multiple R-squared:  0.5847, Adjusted R-squared:  0.5579 
## F-statistic: 21.82 on 2 and 31 DF,  p-value: 1.216e-06

4. ResIN function advanced features

4.1 Advanced ResIN features

cor_method: Which correlation method should be used? Possible are “pearson”, “kendall”, “spearman”, and “auto”. The “auto” option calls the qgraph::corr_auto function which provides a fast implementation of the polychoric correlation coefficient for binary data. Defaults to “auto”.
weights: Optionally specify a weights vector of length n to calculate weighted correlation coefficients using the wCorr method. Defalts to NULL.
method_wCorr: If weights are used, which type of weighted correlation should be used. Options include “Pearson”, “Spearman”, “Polychoric”, and “Polyserial”. Defaults to “Polychoric”
EBICglasso: Optionally further increase the sparseness of ResIn correlation matrix by performing Gaussian graphical lasso algorithim using the Extended Baysian Information Criterion. Method calls the qgraph::EBICglasso Useful for visualizing very large ResIn graphs although the algorithm might take a while to converge. Defaults to FALSE.
EBICglasso_arglist: An arglist that can optionally be fed to the qgraph::EBICglasso function.
node_covars: Optionally select a series of co-variates to obtain node-level statistics. For example the share of female or Republican respondents selecting a particular item response. Can provide useful contextual information for plotting. Defaults to NULL.
node_costats: A character vector of the same length as node_covars detailing the summary statistic to be obtained from the co-variates on the basis of subsets of respondents who selected each item response. First element in node_costats corresponds to the summary statistic to be calculated from the first element in node_covars. Possible stats are “mean”, “median”, “mode”, “sd”, and “var”.
network_stats: Should node level and network level structural descriptive statistics be obtained? If set to TRUE, the function obtains betweenness-, closeness-, and strength centrality, as well as expected influence values for each ResIn network node. At the graph level, the function calculates betweenness-, closeness-, strength- and expected influence centralization, as well as the graph average path length and the global clustering coefficient. Defaults to FALSE.

4.2.1 ResIN function returns

adj_matrix: The k*k ResIn adjacency matrix where k are all unique item responses across all items
same_items: A character vector of length k grouping the ResIn nodes by the original items they pertain to. Useful for grouped plotting of ResIn items
edgelist_frame: A ggplot friendly edge-list dataframe. The dataframe has as many rows as ResIn network edges. Features plotting coordinates for all graph edges and nodes as well as node-level network statistics if called for by the main function.
node_frame: A ggplot friendly node-list dataframe. The dataframe has as many rows as ResIn network nodes. Features plotting coordinates for all nodes as well as node-level network statistics if called for by the main function.
graph_structuration: Named vector containing graph-level structuration statistics.
graph_centralization: Named vector containing graph-level centralization statistics.

4.2.2 Let’s test things out using ESS data!

Select the following issue attitudes from wave 9 of the ESS:
- “gincdif” (Government should reduce income differences)
- “freehms” (Gays and lesbians should live their lifes as they please)
- “impcntr” (Immigration is good for the country)
- “ipstrgv” (Need strong government that ensures safety)
- “impenv” (People should care for the environment)
Filter by country Switzerland (cntry==“CH”)
Select “lrscale” (people’s left-right placements) as a node_covars and specify “mean” as node_costats
Plot the network using mean “lrscale” as a color aesthetic

## Importing ESS 2018 dataset and selecting all relevant variables
ESS_9 <- read.csv("ESS_9.csv")

ESS_9_n <- ESS_9 %>% dplyr::select(cntry, lrscale, gincdif, freehms, impcntr,
                                   ipstrgv, ipeqopt, impenv)

## Sub-setting only Austrian respondents
ESS_9_ch <- ESS_9_n[ESS_9_n$cntry=="CH", ]

4.3 Interactive visualization with visNetwork package

Here, we rely on the visNetwork package to provide a java-script based, HTML implementation for visualizing the ResIn network.
visNetwork models a force-directed algorithm in real time; the package allows for displaying connected edged and node-level statistics upon mouse hovering.

4.3.1 ANES 2020 Interactive Plot

## Very simple example employing the ResIn_gen function to ANES data: 
ResIN_ANES <- ResIN(anes_nodes, EBICglasso = TRUE, network_stats = TRUE)

## Here we need to do just a little bit of prep-work for the visNetwork package
    ## At the node level, visNetwork minimally needs a unique id column.
      ## Additional columns such as "label", "group", and "title" are optional but very useful for interactive plotting
ResIN_ANES$node_frame$id <- ResIN_ANES$node_frame$node_names
ResIN_ANES$node_frame$label <- ResIN_ANES$node_frame$node_names
ResIN_ANES$node_frame$group <- ResIN_ANES$same_items

    ## The "title" column allows for a flexible HTML pop-up window as part of the integration. 
      ## Here, we customize this window with network descriptive statistics:
ResIN_ANES$node_frame$title <- paste0("<center> <strong>", ResIN_ANES$node_frame$label, "</strong> </center> <br>", 
                                          "Strength centrality: ", round(ResIN_ANES$node_frame$Strength, 2), "<br>",
                                          "Betweeness centrality: ", ResIN_ANES$node_frame$Betweenness, "<br>",
                                          "Closeness centrality: ", round(ResIN_ANES$node_frame$Closeness, 3) , "<br>",
                                          "Expected influence: ", round(ResIN_ANES$node_frame$ExpectedInfluence, 2), "<br>")

## At the edge-level, visNetwork minimally requires a "from" and "to" vector. 
    ## Here we additionally specify the edge weights as "value" collumn.
ResIN_ANES$edgelist_frame$value <- ResIN_ANES$edgelist_frame$weight

## Note how we keep the visualization as sparse as possible by specifying a 0-degree nearest neighbor highlighting option:
ResIN_ANES_plot <- visNetwork(ResIN_ANES$node_frame, ResIN_ANES$edgelist_frame,
                                  main = "Interactive ANES 2020 Belief System Visualization") %>% 
  visPhysics(solver = "forceAtlas2Based", forceAtlas2Based = list(stabilization = 10)) %>% 
  visOptions(highlightNearest = list(enabled = TRUE, degree = 0, hover = TRUE)) %>%
  visNodes(size = 24) %>% visInteraction(hideEdgesOnDrag = TRUE)

saveRDS(ResIN_ANES_plot, "ResIN_ANES_plot.rds")

V200002	V200003	V200004	V200005	V200006	V200007	V200008
3	2	3	0	-2	-2	-2
3	2	3	0	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3

V200002	V200003	V200004	V200005	V200006	V200007	V200008
3	2	3	0	-2	-2	-2
3	2	3	0	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3

ResIN R Workshop Bremen CSS 2023

2023-07-05

Teaser I

Teaser II

Teaser III

Workshop contents

Workshop prerequisites

1. Data preparation & cleaning

1.1 Data preparation - before

1.2 Data preparation - recode missing

1.4 Data preparation - adding lables

1.5 Data preparation - after

2. Main Res-IN algorithim

2.1.1 Item binarization

2.1.2 Item binarization - example

2.2 Res-IN adjacency matrix

2.2.1 ResIN (sparse) correlation matrix (no within-item correlations)

2.2.2 Zero-out all remaining negative correlations (default & strongly recommended)

2.3 Force-directed location estimation

2.3.1 Force-directed location estimation - using igraph package

2.3.2 Force-directed location estimation - data-frame

2.3.3 Force-directed location estimation - plot

2.4 Coordinate rotation

2.4.1 Coordinate rotation using PCA

2.4.1 Coordinate rotation using PCA - R example

3 Integration & visualization with qgraph and igraph packages

3.1 qgraph-integration

3.2 Visualization with igraph and ggplot2

3.2.1 Visualization with ggplot2 - prep work

3.2.2 Visualization with ggplot2 example data-frame

3.2.3 Visualization with ggplot2 - basic plot

3.2.4 Visualization with ggplot2 - basic plot

3.3.5 Visualization with ggplot2 - more fancy plot

3.3.7 Visualization with ggplot2 - data-frame

3.3.8 Visualization with ggplot2 - more fancy plot

3.3.9 Visualization with ggplot2 - more fancy plot

3.4. Correlational analysis

4. ResIN function advanced features

4.1 Advanced ResIN features

4.2.1 ResIN function returns

4.2.2 Let’s test things out using ESS data!

4.3 Interactive visualization with visNetwork package

4.3.1 ANES 2020 Interactive Plot

4.3.2 ANES 2020 Interactive Plot

V200002	V200003	V200004	V200005	V200006	V200007	V200008
3	2	3	0	-2	-2	-2
3	2	3	0	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3
3	2	3	0	-2	-2	-2
3	2	3	0	-2	-2	-2
3	2	3	1	4	-1	3