
User-Generated Ratings in Healthcare: Evidence from Yelp

Yiwei Chen

Advisor: Kate Bundorf

Abstract: Whether user-generated physician ratings from online sources improve healthcare efficiency is controversial. Using the universe of Yelp physician ratings matched with Medicare claims, I examine what information on physician quality Yelp ratings reveal, whether they affect patients' choice of physician, and how they change physician behavior. Through text and correlational analysis, I show that although Yelp reviews primarily describe physicians’ interpersonal skills, Yelp ratings are also positively correlated with various measures of clinical quality. Instrumenting physicians’ average ratings with reviewers' “harshness” in rating other businesses, I find that a one-star increase in a physician's average rating raises revenue and patient volume by 1-2%. Using a difference-in-differences strategy, I find that after their physicians are rated on Yelp, patients do not receive different amounts of opioid prescriptions or show different health outcomes, although they receive slightly more lab and imaging tests, which are possibly wasteful. Overall, Yelp ratings seem to help patients: they convey both physicians' interpersonal skills and clinical abilities, steer patients toward higher-rated physicians, and do not induce physicians to harm patients’ health by ordering harmful substances.

William J. Perry Conference Room

Encina Hall

2nd Floor

616 Serra Mall (Address changed due to construction)

Stanford, CA 94305

Seminars

615 Crothers Way Encina Commons, MC6019
Stanford University
Stanford, CA 94305-6006

Adjunct Affiliate, Stanford Health Policy
Adjunct Professor, Stanford School of Medicine
Adjunct Lecturer, Stanford Graduate School of Education
Faculty Fellow, Stanford Center for Innovation in Global Health
Founder and CEO, TeachAids
PhD, MA

Dr. Piya Sorcar is the founder and CEO of TeachAids, an Adjunct Professor at Stanford’s School of Medicine, and an Adjunct Lecturer at the Graduate School of Education. She leads a team of world experts in medicine, public health, and education to address some of the most pressing public health challenges.


TeachAids is an award-winning 501(c)(3) nonprofit social venture that creates breakthrough software addressing numerous persistent problems in health education around the world, including HIV/AIDS, concussion, and COVID-19. A pioneer in the development of infectious disease education, TeachAids HIV education software is used in 82 countries. In partnership with the US Olympic Committee’s National Governing Bodies, TeachAids has launched the CrashCourse concussion education product suite, which includes research-based applications available online as a standard video and in virtual reality. CoviDB is their third health education initiative, a community-edited platform organizing resources across a comprehensive set of topics relating to COVID-19 for free public use.

Sorcar received her Ph.D. in Learning Sciences and Technology Design and her M.A. in Education from Stanford University. She graduated summa cum laude from the University of Colorado at Boulder with a B.A. in Economics, B.S. in Journalism, and B.S. in Information Systems. She has been an invited speaker at leading universities such as Columbia, Johns Hopkins, Tsinghua, and Yale, and is Chair of the Education Advisory Council for USA Football. MIT Technology Review named her to its TR35 list of the top 35 innovators in the world under 35 and she was the recipient of Stanford’s Alumni Excellence in Education Award.


Abstract

OBJECTIVE:

The diagnosis of bipolar spectrum disorders (BPSDs [bipolar I and II disorders, cyclothymic disorder, and bipolar disorder not otherwise specified]) in youth remains controversial. The present study evaluated the possibility that the presence of persistent manic symptoms over a relatively short interval may increase the probability of a BPSD DSM diagnosis.

METHOD:

Data were obtained from the screening and baseline assessments collected from 2005 through 2008 of an ongoing prospective, longitudinal study (Longitudinal Assessment of Manic Symptoms) examining the diagnosis and phenomenology of youth (N = 692) presenting to outpatient centers at ages 6-12 years. Youth were assessed for elevated symptoms of mania (ESM) with the Parent General Behavior Inventory-10-Item Mania Scale (PGBI-10M), the primary outcome measure. Screening and baseline scores separated individuals into those with ESM (ESM+; PGBI-10M score ≥ 12) and a control group of youth without ESM (ESM-; PGBI-10M score < 12). Youth were classified into 4 groups: persistent ESM+, remitted ESM+, persistent ESM-, and progressed to ESM+.

RESULTS:

Individuals with persistent ESM+ were more likely to have a BPSD (relative risk = 3.04; 95% CI, 2.15-4.30). Using 2 administrations of the PGBI-10M spaced over a relatively brief interval (median = 4.0, mean = 6.1, SD = 5.9 weeks) improved the prediction of BPSD over using only the first administration (ΔR² = 0.10, Δχ²(1) = 50.06, P < .001). Likelihood ratios indicated that persistent ESM- substantially decreased the probability of BPSD. While high levels of persistent ESM+ increased the probability of a BPSD diagnosis, the final positive predictive value was only sufficient to signify the need for more thorough clinical evaluation.

CONCLUSIONS:

In many cases, obtaining repeated parent report of mania symptoms substantially altered the probability of a BPSD diagnosis and may be a useful adjunct to a careful clinical evaluation. Future waves of data collection from this longitudinal study will be crucial for devising clinically useful methods for identifying or ruling out pediatric BPSD.
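
The likelihood-ratio reasoning in the Results above can be illustrated with a short sketch. The function name, the pretest probability, and the likelihood ratios below are hypothetical values chosen for illustration; they are not figures reported by the study.

```python
def post_test_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Update a pretest probability with a likelihood ratio, working in odds."""
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical pretest probability of a BPSD diagnosis (assumption, not study data).
PRETEST = 0.20

# Hypothetical likelihood ratios for two screening patterns (assumptions).
scenarios = {
    "persistently low scores (LR = 0.2)": 0.2,
    "persistently high scores (LR = 4.0)": 4.0,
}

for label, lr in scenarios.items():
    print(f"{label}: post-test probability = {post_test_probability(PRETEST, lr):.3f}")
```

A likelihood ratio well below 1 pulls the probability toward zero, which is the sense in which a persistently negative screen can help rule a diagnosis out, while a ratio above 1 raises the probability without necessarily making the diagnosis certain.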

Publication Type
Journal Articles
Publication Date
Journal Publisher
Journal of Clinical Psychiatry
Authors
Frazier TW
Youngstrom EA
Sarah (Sally) M. Horwitz
Demeter CA
Fristad MA
Arnold LE
Birmaher B
Kowatch RA
Axelson D
Ryan N
Gill MK
Findling RL

Background: The lower sensitivity of mammographic screening in women aged 40-49 years, compared with women aged 50-69 years, is largely attributed to lower mammographic tumor detectability and faster tumor growth in the younger women.

Methods: We used a Monte Carlo simulation model of breast cancer screening by age to estimate the median tumor size detectable on a mammogram and the mean tumor volume doubling time. The estimates were calculated by calibrating the predicted breast cancer incidence rates to the actual rates from the Surveillance, Epidemiology, and End Results (SEER) database and the predicted distributions of screen-detected tumor sizes to the actual distributions obtained from the Breast Cancer Surveillance Consortium (BCSC). The calibrated parameters were used to estimate the relative impact of lower mammographic tumor detectability vs faster tumor volume doubling time on the poorer screening outcomes in younger women compared with older women. Mammography screening outcomes included sensitivity, mean tumor size at detection, lifetime gained, and breast cancer mortality. In addition, the relationship between screening sensitivity and breast cancer mortality was investigated as a function of tumor volume doubling time, mammographic tumor detectability, and screening interval.

Results: Lowered mammographic tumor detectability accounted for 79% and faster tumor volume doubling time accounted for 21% of the poorer sensitivity of mammography screening in younger women compared with older women. The relative contributions were similar when the impact of screening was evaluated in terms of mean tumor size at detection, lifetime gained, and breast cancer mortality. Screening sensitivity and breast cancer mortality reduction attributable to screening were almost linearly related when comparing annual or biennial screening with no screening. However, when comparing annual with biennial screening, the greatest reduction in breast cancer mortality attributable to screening did not correspond to the greatest gain in screening sensitivity and was more strongly affected by the mammographic tumor detectability than tumor volume doubling time.

Conclusion: The age-specific differences in mammographic tumor detection contribute more than age-specific differences in tumor growth rates to the lowered performance of mammography screening in younger women.
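
To make the two quantities in this abstract concrete, here is a minimal arithmetic sketch of how a tumor volume doubling time and a detectable tumor size interact. It assumes a spherical tumor with simple exponential volume growth, which may differ from the growth model used in the study's simulation; the function name and all numeric inputs are hypothetical.

```python
import math


def days_to_grow(start_diameter_mm: float, end_diameter_mm: float,
                 volume_doubling_time_days: float) -> float:
    """Days for a spherical tumor to grow between two diameters.

    Volume scales with diameter cubed, so multiplying the diameter by k
    takes 3 * log2(k) volume doublings under exponential volume growth.
    """
    if start_diameter_mm <= 0 or end_diameter_mm < start_diameter_mm:
        raise ValueError("diameters must satisfy 0 < start <= end")
    if volume_doubling_time_days <= 0:
        raise ValueError("volume_doubling_time_days must be positive")
    volume_doublings = 3.0 * math.log2(end_diameter_mm / start_diameter_mm)
    return volume_doublings * volume_doubling_time_days


# Hypothetical inputs for illustration only (not estimates from the study):
# a larger detectable size means the tumor has been growing longer before
# a mammogram can see it; a shorter doubling time compresses that interval.
for detectable_mm in (8.0, 12.0):
    for doubling_days in (100.0, 200.0):
        days = days_to_grow(2.0, detectable_mm, doubling_days)
        print(f"detectable at {detectable_mm} mm, doubling time {doubling_days} d: "
              f"{days:.0f} days from 2 mm")
```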

Publication Type
Journal Articles
Publication Date
Journal Publisher
Journal of the National Cancer Institute
Authors
Stephanie Rutledge (Bailey)
Sigal B
Sylvia K. Plevritis

Background: Despite trials of mammography and widespread use, optimal screening policy is controversial.

Objective: To evaluate U.S. breast cancer screening strategies.

Design: 6 models using common data elements.

Data Sources: National data on age-specific incidence, competing mortality, mammography characteristics, and treatment effects.

Target Population: A contemporary population cohort.

Time Horizon: Lifetime.

Perspective: Societal.

Interventions: 20 screening strategies with varying initiation and cessation ages applied annually or biennially.

Outcome Measures: Number of mammograms, reduction in deaths from breast cancer or life-years gained (vs. no screening), false-positive results, unnecessary biopsies, and overdiagnosis.

Results of Base-Case Analysis: The 6 models produced consistent rankings of screening strategies. Screening biennially maintained an average of 81% (range across strategies and models, 67% to 99%) of the benefit of annual screening with almost half the number of false-positive results. Screening biennially from ages 50 to 69 years achieved a median 16.5% (range, 15% to 23%) reduction in breast cancer deaths versus no screening. Initiating biennial screening at age 40 years (vs. 50 years) reduced mortality by an additional 3% (range, 1% to 6%), consumed more resources, and yielded more false-positive results. Biennial screening after age 69 years yielded some additional mortality reduction in all models, but overdiagnosis increased most substantially at older ages.

Results of Sensitivity Analysis: Varying test sensitivity or treatment patterns did not change conclusions.

Limitation: Results do not include morbidity from false-positive results, patient knowledge of earlier diagnosis, or unnecessary treatment.

Conclusion: Biennial screening achieves most of the benefit of annual screening with less harm. Decisions about the best strategy depend on program and individual objectives and the weight placed on benefits, harms, and resource considerations.

Publication Type
Journal Articles
Publication Date
Journal Publisher
Annals of Internal Medicine
Authors
Mandelblatt JS
Stephanie Rutledge (Bailey)
et al

OBJECTIVE: Evaluate KNAVE-II, a knowledge-based framework for visualization, interpretation, and exploration of longitudinal clinical data, clinical concepts, and patterns. KNAVE-II mediates queries to a distributed temporal-abstraction architecture (IDAN), which uses a knowledge-based problem-solving method specializing in on-the-fly computation of clinical queries.

METHODS: A two-phase, balanced crossover study compared the efficiency and satisfaction of a group of clinicians answering queries of variable complexity about time-oriented clinical data, typical of oncology protocols, using KNAVE-II versus standard methods: both paper charts and a popular electronic spreadsheet (ESS) in Phase I; an ESS in Phase II. Measurements included the time required to answer and the correctness of the answer for each query, for each complexity category, and for all queries, assessed against a predetermined gold standard set by a domain expert. User satisfaction was assessed with the Standard Usability Score (SUS) tool-specific questionnaire and with a "Usability of Tool Comparison" comparative questionnaire developed for this study.

RESULTS: In both evaluations, subjects answered higher-complexity queries significantly faster using KNAVE-II than using paper charts or an ESS, with a mean difference of up to 255 s per query versus the ESS for hard queries (p=0.0003) in the second evaluation. Average correctness scores over all queries were significantly higher with KNAVE-II than with paper charts in the first phase and with the ESS in the second phase. In the second evaluation, 91.6% (110/120) of all of the questions asked within queries of all levels produced correct answers using KNAVE-II, as opposed to only 57.5% (69/120) using the ESS (p<0.0001). User satisfaction with KNAVE-II was significantly higher than with either a paper chart or the ESS (p=0.006). Clinicians ranked KNAVE-II superior to both paper and the ESS.

CONCLUSIONS: An evaluation of the functionality and usability of KNAVE-II and its supporting knowledge-based temporal-mediation architecture has produced highly encouraging results regarding savings in physician time, improved accuracy of clinical assessment, and user satisfaction.

Publication Type
Journal Articles
Publication Date
Journal Publisher
Artificial Intelligence in Medicine
Authors
Susana B. Martins
Shahar Y
Goren-Bar D
Galperin M
Kaizer H
Basso LV
McNaughton D
Mary K. Goldstein

Six cases of coagulase-negative staphylococcal mediastinitis were identified in the latter half of 1999. A new preoperative cleansing solution was suspected by hospital staff to be a factor in the outbreak. We evaluated this possible risk factor along with other known and suspected surgical site infection risk factors in this case-control study.

Publication Type
Journal Articles
Publication Date
Journal Publisher
Infection Control and Hospital Epidemiology
Authors
MD Van Kerkhove
Julie Parsonnet
M Weingart
LS Tompkins

Although gastric hypochlorhydria is a risk factor for gastroenteritis and for gastric cancer, no reliable, inexpensive, noninvasive test exists for screening or epidemiologic studies. We aimed to evaluate the sensitivity and specificity of the blood quininium resin test (bQRT) for hypochlorhydria against pH monitoring. Twelve fasting adult volunteers (seven with and five without H. pylori infection) ingested 80 mg/kg of quininium resin twice, once with and once without acid suppression. Gastric pH was monitored for 75 minutes; serum samples were obtained at 0 and 75 minutes. The bQRT levels were compared with gastric pH, controlling for omeprazole use and H. pylori infection. Subjects with a median recorded pH ≥ 3.5 were considered hypochlorhydric. Using a bQRT level of 10 as the cutoff for hypochlorhydria, the sensitivity and specificity of the bQRT were 100% and 37.5%, respectively. The bQRT predicted omeprazole use more accurately than pH monitoring. In conclusion, the bQRT has a high sensitivity for hypochlorhydria, making it potentially useful in populations with a high prevalence of hypochlorhydria. In its current formulation, the bQRT's low specificity makes it less useful in low-risk populations.
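
The closing point about prevalence can be sketched numerically. The snippet below applies Bayes' rule to the sensitivity (100%) and specificity (37.5%) reported in this abstract; the function name and the prevalence values are hypothetical and were chosen only to show the trend, not taken from the study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    # Guard against a test that never returns a positive (or negative) result.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else float("nan")
    return ppv, npv


# Sensitivity and specificity as reported in the abstract.
SENSITIVITY = 1.00
SPECIFICITY = 0.375

# Hypothetical prevalence values (assumptions for illustration).
for prevalence in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With perfect sensitivity, a negative result rules the condition out at any prevalence, but the low specificity means that most positives in a low-prevalence population would be false positives.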

Publication Type
Journal Articles
Publication Date
Journal Publisher
Digestive Diseases and Sciences
Authors
C de Martel
S Ratanasopa
D Passaro
Julie Parsonnet