Bulletin of the American Physical Society
APS April Meeting 2023
Volume 68, Number 6
Minneapolis, Minnesota (Apr 15-18)
Virtual (Apr 24-26); Time Zone: Central Time
Session B18: Theoretical Frameworks and Methodologies (Education)
Sponsoring Units: GPER
Chair: Jennifer Blue, Miami University
Room: Marquette IX - 2nd Floor
Saturday, April 15, 2023 10:45AM - 10:57AM
B18.00001: Automating motivational coding of CUWiP responses using natural language processing approaches
Colin Green, Eric Brewe
As a component of the evaluation of the Conferences for Undergraduate Women in Physics (CUWiP), we have been investigating motivating factors for women who choose to major in physics. Previous work, led by Franklin, developed a hand-coding scheme for motivation based on theories of self-efficacy and expectancy value. This coding identified motivational factors women expressed for entering physics: Physiological/Emotional, Vicarious Experience, Mastery Experience, Intrinsic Value, Intrinsic Value (Astronomy), Social Intrinsic Value, Utility Value, Media Triggered Intrinsic Value, Event Triggered Intrinsic Value, Attainment Value, and Social Persuasion. Franklin et al. also identified costs associated with majoring in physics: Cost (Emotional), Cost (Task Effort), Cost (Loss of Valued Alternatives), and Cost (Outside Effort). This coding was applied to a year's worth of CUWiP responses to a question asking attendees their motivation for choosing physics; a total of 2125 responses were hand coded. We are developing a machine learning approach, using natural language processing, to identify the motivational codes of responses from other years of CUWiP. Preliminary efforts have used the word vectorizers and binary logistic regression methods found in the sklearn Python package. The logistic regression approach achieves 80% or higher accuracy in identifying the codes.
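The abstract's pipeline (a word vectorizer feeding binary logistic regression from sklearn) can be sketched as follows. This is a minimal illustration, not the authors' code: the responses and labels below are invented stand-ins for the hand-coded CUWiP data, and the single binary label stands in for one motivational code.

```python
# Sketch of a bag-of-words + binary logistic regression classifier,
# as described in the abstract. Data below is invented for illustration;
# in practice one classifier would be trained per motivational code.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for survey responses (hypothetical examples).
responses = [
    "My high school teacher encouraged me to keep going in physics.",
    "I have always loved astronomy and the night sky.",
    "Solving hard problems feels rewarding to me.",
    "A documentary about black holes sparked my interest.",
]
# Binary labels for a single hypothetical code (1 = code applies).
labels = [0, 1, 1, 1]

# CountVectorizer turns text into word-count features; the pipeline
# chains it with the logistic regression classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(responses, labels)
predicted = model.predict(["I enjoy figuring out how the universe works."])
```

In a real application each of the motivational codes listed above would get its own binary classifier, trained on the 2125 hand-coded responses and evaluated on held-out data.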
Saturday, April 15, 2023 10:57AM - 11:09AM
B18.00002: Why the machine (dis)agrees: understanding uncertainty in natural language processing classifications
Rebeckah Fussell, Natasha G Holmes
As interest grows in using natural language processing methods ("machine coding") to supplant labor-intensive human coding of survey responses, the physics education research community needs methods to determine the accuracy and reliability of machine coding. Existing literature uses measures of agreement between human and machine coding (e.g., Cohen's kappa) to assess machine coding. However, if we are ever to trust a machine learning algorithm's codes without a thorough comparison to human coding, we need to understand the causes of the underlying agreement and disagreement, not simply its level. For datasets of responses to several survey questions, we will present data on the uncertainty levels of machine coding as a function of (i) training set characteristics and (ii) test set characteristics, and discuss the underlying causes of these uncertainty levels. We describe the conditions in which we can use these uncertainty measurements to form trustworthy conclusions from machine-coded data.
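Cohen's kappa, the agreement measure the abstract cites, corrects raw human-machine agreement for the agreement expected by chance. A minimal computation with invented codes, using scikit-learn's implementation:

```python
# Cohen's kappa between human and machine binary codes.
# The label sequences below are invented for illustration.
from sklearn.metrics import cohen_kappa_score

human   = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
machine = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Raw agreement is 8/10 = 0.8; chance agreement from the marginals is
# 0.6*0.6 + 0.4*0.4 = 0.52, so kappa = (0.8 - 0.52) / (1 - 0.52) ≈ 0.583.
kappa = cohen_kappa_score(human, machine)
```

The abstract's point is that two coder pairs can produce the same kappa for very different underlying reasons, which is why the level of agreement alone is not enough to certify a machine coder.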
Saturday, April 15, 2023 11:09AM - 11:21AM
B18.00003: Effects of Anchor Item Choices on Bias on the Force Concept Inventory across the intersection of gender and race
John B Buncher, Jayson M Nissen, Ben Van Dusen, Robert M Talbot
Education researchers often compare performance across race and gender on research-based assessments of physics knowledge to investigate the impacts of racism and sexism on physics student learning. These investigations' claims rely on research-based assessments providing reliable, unbiased measures of student knowledge across social identity groups. We used item response theory (IRT) and differential item functioning (DIF) analysis to examine whether the items on the Force Concept Inventory provided unbiased data across social identifiers for race, gender, and their intersections. A crucial choice in any IRT/DIF analysis is which items to "anchor" across groups, that is, which items are assumed to behave the same. Here we discuss how the choice of anchors significantly alters the results of any such analysis, the assumptions each choice of anchors carries with it, and how these choices affect the interpretation of the results.
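The role of the anchor set can be illustrated with a simpler DIF method than the IRT analysis the authors used: a Mantel-Haenszel common odds ratio, stratified by the total score on a chosen anchor set. The simulation below is entirely synthetic (group, ability, and the planted bias are invented), but it shows the mechanic the abstract describes: the anchor items define the strata in which the groups are compared, so changing the anchor changes the verdict.

```python
# Illustrative Mantel-Haenszel DIF check on simulated data (not the
# authors' method or data). Item 0 is planted with a 0.8-logit advantage
# for the focal group; anchoring on the unbiased items should flag it.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_items = 2000, 10
ability = rng.normal(size=n_students)
group = rng.integers(0, 2, size=n_students)      # 0 = reference, 1 = focal
difficulty = rng.normal(size=n_items)
logits = ability[:, None] - difficulty[None, :]
logits[:, 0] += 0.8 * group                      # item 0 favors the focal group
probs = 1.0 / (1.0 + np.exp(-logits))
scores = (rng.random((n_students, n_items)) < probs).astype(int)

def mh_odds_ratio(scores, group, item, anchor):
    """Mantel-Haenszel common odds ratio for one item, stratifying
    students by their total score on the anchor items."""
    strata = scores[:, anchor].sum(axis=1)
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        item_m, group_m = scores[m, item], group[m]
        a = ((item_m == 1) & (group_m == 0)).sum()   # reference correct
        b = ((item_m == 0) & (group_m == 0)).sum()   # reference incorrect
        c = ((item_m == 1) & (group_m == 1)).sum()   # focal correct
        d = ((item_m == 0) & (group_m == 1)).sum()   # focal incorrect
        t = m.sum()
        num += a * d / t
        den += b * c / t
    return num / den

# An odds ratio of 1 means no DIF; anchoring on items 1..9 (which behave
# the same across groups) yields a ratio well below 1 for item 0.
ratio = mh_odds_ratio(scores, group, item=0, anchor=list(range(1, 10)))
```

If the biased item were folded into the anchor set, the strata themselves would be contaminated, attenuating the detected bias: exactly the kind of assumption the abstract says each anchor choice carries.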
Saturday, April 15, 2023 11:21AM - 11:33AM
B18.00004: Assessing introductory physics students using large open item bank created using GPT-3 and Wolfram Alpha
Zhongzhou Chen
Most assessments in introductory physics are based on secure assessment items, which are supposed to be revealed to test takers only at the time of the assessment. However, the rise of resource-sharing websites such as Chegg is making it increasingly difficult and expensive to maintain item security.
Saturday, April 15, 2023 11:33AM - 11:45AM
B18.00005: Consistency of item response theory results between data sets
Trevor I Smith, Nasrine Bendjilali
Analyses of data from multiple-choice tests typically begin by scoring each response dichotomously as either correct or incorrect. Dichotomous scoring facilitates many forms of item-level and test-level analyses and leads easily to reporting student scores based on the number of items answered correctly. It also destroys any information that may be learned by examining which particular incorrect responses students select. This loss of information is particularly relevant for data collected using research-based assessments, on which incorrect response options often correspond to specific commonly held ideas. We have previously used nominal response models (NRM) from item response theory to rank incorrect responses to items on one such test (the Force and Motion Conceptual Evaluation, FMCE), and we have argued that specific NRM parameters indicate how close any particular incorrect response is to the correct response. We present evidence of the consistency of these rankings and the parameter values between two large data sets (~6000 students each). We show that a one-dimensional model that treats the FMCE as measuring a single test construct is inadequate, and we discuss promising avenues for more complex analyses.
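The nominal response model the abstract relies on assigns each response option its own slope and intercept, with category probabilities given by a softmax over those parameters. A minimal sketch (the parameter values are invented; the slope ordering is what lets NRM rank distractors by closeness to the correct answer):

```python
# Minimal nominal response model (NRM): the probability of selecting
# option k at ability theta is softmax over z_k = a_k * theta + c_k.
# Parameter values below are invented for illustration.
import numpy as np

def nrm_probs(theta, a, c):
    """Category probabilities under the nominal response model."""
    z = np.asarray(a) * theta + np.asarray(c)
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

# Four response options: the option with the largest slope dominates as
# ability grows, so slopes order the options from "far" to "close" to correct.
a = [0.0, 0.6, 1.1, 1.8]    # slope per option (invented)
c = [0.0, 0.4, -0.2, -0.5]  # intercept per option (invented)
p_low = nrm_probs(-2.0, a, c)   # low-ability student
p_high = nrm_probs(+2.0, a, c)  # high-ability student
```

With these parameters a low-ability student most likely selects option 0 and a high-ability student option 3, which is the ordering behavior the authors exploit when ranking FMCE distractors.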
Saturday, April 15, 2023 11:45AM - 11:57AM
B18.00006: Applying Causal Inference Principles to the Analysis of Observational Studies in Physics Education
Vidushi Adlakha, Eric Kuo
Numerous quantitative observational studies in physics education research aim to determine the causal relationships among various student, classroom, and instructional factors. However, several errors can occur in estimating causal relationships among highly correlated variables, including those due to confounding, omitted variables, reverse causality, and selection. Many such errors can be understood in a unified way through a set of causal inference principles applied to causal network diagrams, which were developed in and are increasingly used by other fields, such as medicine and the social sciences. Three fundamental causal structures (chain, fork, and collider), together with a set of rules for interpreting analyses under these structures, can precisely model when these analytic errors are present. We examine how these analytic principles can detect potential errors when making causal predictions of the effects of potential interventions from observational studies.
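The collider structure mentioned in the abstract produces the least intuitive of these errors: two causally independent variables become correlated once one conditions on their common effect. A small simulation makes this concrete (all variable names and the selection rule are invented for illustration):

```python
# Simulating collider (selection) bias: x and y are independent causes
# of the collider c. Conditioning on c, e.g. by analyzing only a
# selected subsample, induces a spurious negative x-y correlation.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x = rng.normal(size=n)             # e.g., prior preparation (hypothetical)
y = rng.normal(size=n)             # e.g., motivation, independent of x
c = x + y + rng.normal(size=n)     # collider: caused by both x and y

r_full = np.corrcoef(x, y)[0, 1]   # near zero: no causal link between x, y

selected = c > 1.0                 # conditioning on the collider
r_selected = np.corrcoef(x[selected], y[selected])[0, 1]  # clearly negative
```

Within the selected subsample, a high c with low x implies y was probably high, and vice versa, manufacturing a correlation that no intervention on x or y would reproduce.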
Saturday, April 15, 2023 11:57AM - 12:09PM
B18.00007: Theoretical Considerations When Predicting the Outcome of Future Educational Interventions from Observational Studies
Eric Kuo, Vidushi Adlakha
Recent developments in quantitative causal inference provide tools for modeling the mechanistic pathways through which interventions can impact students' educational experiences and outcomes. These tools can be applied when creating and interpreting causal network diagrams, where measured variables are nodes connected by directed links. Using these causal inference tools, we propose key considerations that researchers should address when proposing educational interventions based on observational studies. These include: (i) considering multiple alternative models that make different causal predictions, (ii) specifying how interventions can plausibly impact specific nodes, and (iii) specifying how interventions can produce new mediations and moderations in a causal network. These considerations can be explicitly addressed by specifying the parameters of proposed future intervention studies and, later, comparing the results of those future studies to past observational results.
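Consideration (ii), that an intervention must be specified as acting on particular nodes, can be illustrated with a do-style intervention on a mediator in a simulated chain. The network, coefficients, and variable names below are invented; the point is only that a causal model predicts the effect of setting a node's value, not merely of observing it:

```python
# A hypothetical chain u -> m -> y. Intervening on the mediator m
# (do(m = m + 1)) shifts the outcome by its direct coefficient (0.5 here),
# regardless of the upstream factor u. All values are invented.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
u = rng.normal(size=n)                 # upstream factor (e.g., prior interest)
m = 0.8 * u + rng.normal(size=n)       # mediator on the path u -> m -> y
y = 0.5 * m + rng.normal(size=n)       # outcome

# Simulate the intervention: set m one unit higher for everyone,
# regenerating the outcome from the same structural equation.
y_do = 0.5 * (m + 1.0) + rng.normal(size=n)
effect = y_do.mean() - y.mean()        # predicted intervention effect ~ 0.5
```

An alternative model in which m and y share a common cause rather than a direct link (consideration (i)) would predict a null effect for the same intervention, which is exactly why competing diagrams must be stated before the intervention study is run.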
Saturday, April 15, 2023 12:09PM - 12:21PM
B18.00008: Draft of Referee Guidelines to maintain clarity and generativity of quantitative research studies in Physical Review Physics Education Research
Tim J Stelzer, Eric Brewe, Andrew F Heckler, Rachel J Henderson, Natasha G Holmes, Eric Kuo