Codebook

This codebook documents the survey instrument used in the Trust in AI Commerce Report v1. Every variable in the published dataset is paired here with its exact question wording, response options, value labels, universe rules, and derivation logic where applicable.

A reader who has both this codebook and the dataset can reproduce every statistic in the published report.

Study Summary

Field	Value
Field date	2026-04-27
Field duration	22 hours
Survey platform	Alchemer
Panel provider	Cint
Sample frame	U.S. general population (online shoppers, 18+)
Total records collected	3,638
Disqualified at screener	1,778
Partial completions	397
Complete responses (analysis basis)	1,463
Languages	English (U.S.)
Median completion time	~8 minutes
License	CC BY 4.0

For full methodology — sampling design, screener logic, weighting, integrity flags, and limitations — see the Methodology entity or §7 of the published report.

File Structure

The published dataset.csv contains:

3,638 rows — all dispositions retained for screen-out auditability. Filter to Status = "Complete" to reproduce the n=1,463 analysis basis.
146 columns — the full Alchemer export minus 7 columns stripped for privacy (see below).
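
The filter described above can be sketched with Python's standard `csv` module. The function name `load_completes` and the three-row inline sample are illustrative assumptions; only the `Status` column and its `Complete` value come from this codebook.

```python
import csv
import io

def load_completes(csv_file):
    """Return only the rows whose Status is 'Complete' (the analysis basis)."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row["Status"].strip() == "Complete"]

# Illustrative stand-in for dataset.csv (not real data).
_sample = io.StringIO(
    "Response ID,Status\n"
    "1,Complete\n"
    "2,Disqualified\n"
    "3,Partial\n"
)
completes = load_completes(_sample)
print(len(completes))  # 1 for this sample
```

To run it against the published file, open `dataset.csv` with `newline=""` and pass the file object to `load_completes`. The file's text encoding is not stated in this codebook, so treat any `encoding` argument as an assumption.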

Privacy Strips

Per Product.ai Research's data-publication standard, the following 7 columns were removed before public release. All other 146 columns are published as Alchemer exported them.

Stripped column	Reason
IP Address	Network identifier
Longitude	Geolocation
Latitude	Geolocation
Country	Geolocation
City	Geolocation
State/Region	Geolocation
Postal	Geolocation

How to Read This Codebook

Question wording in the column headers is exactly as shown to respondents in the Alchemer instrument. Each variable entry below documents:

Column header — the CSV column name (matches what's shown to respondents, abbreviated where useful)
Variable type — single-select, multi-select, ordinal scale, free text, derived
Response set — full list of value labels
Universe — which respondents saw the question (skip-logic notes)
Notes — derivation rules and reporting caveats

Conventions

Matrix questions (e.g., the 18-category Purchase Research Confidence Index) appear as one column per row of the matrix, with the row label prepended to the question stem. The cell value is the response option chosen for that row.

Multi-select questions appear as one column per selectable option. The cell value is the option label if selected, blank otherwise.
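
As a rough illustration of the multi-select convention, the sketch below treats any non-blank cell as "selected" and computes the share of rows choosing an option. The helper names and sample rows are hypothetical, and the exact CSV header for a given option may be longer than the abbreviated label used here.

```python
def selected(cell):
    """A multi-select cell counts as selected when it holds a non-blank label."""
    return cell is not None and str(cell).strip() != ""

def selection_share(rows, column):
    """Fraction of rows in which the given multi-select option was chosen."""
    if not rows:
        return None
    return sum(selected(r.get(column)) for r in rows) / len(rows)

# Illustrative rows (not real data).
rows = [
    {"Online customer reviews": "Online customer reviews"},
    {"Online customer reviews": ""},
    {"Online customer reviews": "Online customer reviews"},
    {"Online customer reviews": ""},
]
share = selection_share(rows, "Online customer reviews")
print(share)  # 0.5
```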

Numeric values in 7-point scale columns (2, 3, 5, 6) are the unlabeled intermediate points on a Likert scale with verbal anchors at 1, 4, and 7.

Mixed-format encoding correction (apply before computing means)

The raw Alchemer export mixes text labels and numeric values on the same column for three endpoint-anchored scale questions: PC-1 (last-purchase confidence, 0–10 scale), AT-3 (AI satisfaction, 1–7 scale), and AT-7 (review trust, 1–7 scale). Endpoint and midpoint positions carry text labels; intermediate positions carry numeric codes. A naive mean computation that does not remap the text labels will silently drop the highest-frequency endpoint clusters from the average and understate the result. The §7.9 Limitations note in the report documents the corrected published values; the remap rules used to compute those corrected means are restated below for replicators working directly from the raw dataset.

PC-1 — "For that purchase, how confident were you that you found the best product for your needs before completing the purchase?" (0–10 scale)

Apply this mapping before averaging:

Raw value (as exported)	Mapped numeric value
`"Not at all confident"`	0
`1`	1
`2`	2
`3`	3
`4`	4
`"Neutral"`	5
`6`	6
`7`	7
`8`	8
`9`	9
`"Extremely confident"`	10

Compute the mean across the full non-missing column. Correct value: 6.95 / 10 (n = 1,453).

AT-3 — "How satisfied were you with the AI assistant's help?" (1–7 scale; AI users only)

Raw value (as exported)	Mapped numeric value
`"Not at all satisfied"`	1
`2`	2
`3`	3
`"Neutral"`	4
`5`	5
`6`	6
`"Extremely satisfied"`	7

Compute the mean across all AI users (Q12 = "Yes"). Correct value: 5.46 / 7 (n = 623).

AT-7 — "How confident are you that online product reviews accurately reflect product quality?" (1–7 scale; full sample)

Raw value (as exported)	Mapped numeric value
`"Not at all confident"`	1
`2`	2
`3`	3
`"Neutral"`	4
`5`	5
`6`	6
`"Extremely confident"`	7

The AT-7 question was shown to all respondents (n = 1,462 non-missing), not filtered to review users. The report's cited mean uses the actual-review-user denominator — respondents who selected "Online customer reviews" in PC-4 (n = 728) — and produces a corrected value of 4.92 / 7. The full-sample mean is 4.66 / 7 (n = 1,462) if a broader denominator is required.
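
The two denominators can be compared side by side with a sketch like the following. The column keys `"AT-7"` and `"reviews"` are placeholders for the real CSV headers, the helper names are hypothetical, and the sample rows are invented for illustration.

```python
REMAP_1_7 = {"Not at all confident": 1, "Neutral": 4, "Extremely confident": 7}
VALID_CODES = {str(n) for n in range(1, 8)}

def score(cell):
    """Map a raw AT-7 cell to 1-7, or None if blank or unrecognized."""
    text = "" if cell is None else str(cell).strip()
    if text in REMAP_1_7:
        return REMAP_1_7[text]
    return int(text) if text in VALID_CODES else None

def mean_at7(rows, at7_col, reviews_col=None):
    """Mean AT-7 score; if reviews_col is given, keep only rows that selected it."""
    scores = []
    for r in rows:
        if reviews_col is not None and not str(r.get(reviews_col) or "").strip():
            continue  # respondent did not select the review option
        s = score(r.get(at7_col))
        if s is not None:
            scores.append(s)
    return sum(scores) / len(scores) if scores else None

# Illustrative rows (not real data).
rows = [
    {"AT-7": "Extremely confident", "reviews": "Online customer reviews"},
    {"AT-7": "5", "reviews": "Online customer reviews"},
    {"AT-7": "2", "reviews": ""},
    {"AT-7": "Neutral", "reviews": ""},
]
full = mean_at7(rows, "AT-7")              # (7 + 5 + 2 + 4) / 4 = 4.5
users = mean_at7(rows, "AT-7", "reviews")  # (7 + 5) / 2 = 6.0
```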

Implementation note for replicators

In Python, the canonical remap is:

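REMAP_0_10 = {"Not at all confident": 0, "Neutral": 5, "Extremely confident": 10}
REMAP_1_7  = {"Not at all confident": 1, "Not at all satisfied": 1, "Neutral": 4,
              "Extremely confident": 7, "Extremely satisfied": 7}

def to_num(v, remap):
    # Map one raw cell to a number; return None if blank or unrecognized.
    if v is None: return None
    # Already numeric; NaN (how pandas encodes a blank) is treated as missing.
    if isinstance(v, (int, float)): return None if v != v else v
    s = str(v).strip()
    if s in remap: return remap[s]  # text-labeled anchor
    # Numeric codes read from a CSV arrive as strings ("2", "3", ...), so parse them.
    try:
        return float(s)
    except ValueError:
        return None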

Apply REMAP_0_10 to the PC-1 column; apply REMAP_1_7 to AT-3 and AT-7. Filter the dataset to Status == "Complete" for the 1,463-row analysis basis before averaging.
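
The remap, filter, and average steps can be sketched end to end as follows. This is a minimal illustration under stated assumptions: the `to_num` variant here parses numeric codes that arrive as strings (as they do when the CSV is read with the standard `csv` module), `column_mean` is a hypothetical helper, and `"PC-1"` stands in for the full column header.

```python
REMAP_0_10 = {"Not at all confident": 0, "Neutral": 5, "Extremely confident": 10}

def to_num(value, remap):
    """Convert one raw cell to a float, or None when blank or unrecognized."""
    text = "" if value is None else str(value).strip()
    if text in remap:
        return float(remap[text])
    try:
        return float(text)
    except ValueError:
        return None

def column_mean(rows, column, remap):
    """Mean of a mixed-format scale column over completes with a usable value."""
    values = [
        to_num(r.get(column), remap)
        for r in rows
        if r.get("Status") == "Complete"
    ]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None

# Illustrative rows (not real data).
rows = [
    {"Status": "Complete", "PC-1": "Extremely confident"},
    {"Status": "Complete", "PC-1": "6"},
    {"Status": "Complete", "PC-1": "Neutral"},
    {"Status": "Complete", "PC-1": ""},                      # missing, skipped
    {"Status": "Partial",  "PC-1": "Not at all confident"},  # not a complete
]
print(column_mean(rows, "PC-1", REMAP_0_10))  # 7.0
```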

A. Administrative & Metadata

Columns Alchemer auto-generates for every response. These are present in the published CSV in their original form (except the 7 privacy strips noted above).

#	Column	Type	Notes
1	`Response ID`	Integer	Alchemer's internal response identifier. Unique per row.
2	`Time Started`	Timestamp	When the respondent first opened the survey.
3	`Date Submitted`	Timestamp	When the respondent submitted the survey (or when Alchemer last touched the partial).
4	`Status`	Categorical	`Complete` / `Disqualified` / `Partial`. Filter to `Complete` (n=1,463) for analysis.
5	`Contact ID`	String	Cint panelist identifier as supplied via panel integration.
6	`Legacy Comments`	Free text	Mostly empty. Alchemer legacy field.
7	`Comments`	Free text	Mostly empty. Open respondent notes — never required, rarely used.
8	`Language`	String	Survey language. `English`.
9	`Referer`	String	HTTP referrer at survey entry.
10	`SessionID`	String	Alchemer session identifier.
11	`User Agent`	String	Browser/device user-agent string.
12	`Tags`	String	Alchemer audience tags.
20	`RID`	UUID	Cint Respondent ID (UUID format).

B. Screening — Eligibility & Anchor (Q1)

Block code: PC (Purchase Confidence)

#	Column	Type	Response Set
21	`What is your age?`	Single-select	`Under 18`, `18–24`, `25–34`, `35-44`, `44-54`, `55-64`, `65 or older`

Screener rule. Respondents who selected Under 18 were disqualified at this point. The complete sample is 18+.

Note on encoding. The age bands as exported use inconsistent dash characters (some en-dash –, some hyphen -, and one band reads 44-54 rather than 45-54). These are preserved as exported. Analysts deriving age bands should treat 35-44, 44-54, 55-64, 65 or older as the four upper bands; the 44-54 label is interpreted as ages 45–54 (one-year overlap with the prior band reflects the exported labeling, not the underlying assignment).
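
One way to apply this note when deriving age bands is a small lookup that unifies the dash characters and relabels the 44-54 band. The output labels and the function name `normalize_age_band` are assumptions chosen for illustration, not part of the published dataset.

```python
AGE_BAND_LABELS = {
    "Under 18": "Under 18",
    "18-24": "18-24",
    "25-34": "25-34",
    "35-44": "35-44",
    "44-54": "45-54",  # exported label, interpreted as ages 45-54 per the note above
    "55-64": "55-64",
    "65 or older": "65 or older",
}

def normalize_age_band(raw):
    """Return a hyphen-only, corrected age-band label, or None if unrecognized."""
    if raw is None:
        return None
    key = str(raw).strip().replace("\u2013", "-")  # en dash -> hyphen
    return AGE_BAND_LABELS.get(key)

print(normalize_age_band("18\u201324"))  # 18-24
print(normalize_age_band("44-54"))       # 45-54
```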

C. Shopping Mode (Q2)

#	Column	Type	Response Set
22	`Where do you make your purchases when you shop for products?`	Single-select	`I shop only online`, `Mostly online, some in physical stores`, `50/50 online and physical stores`, `I shop only in physical stores`

Screener rule. Respondents who selected I shop only in physical stores were disqualified — the panel anchor is online shoppers.

D. Category Purchases — Past 12 Months (Q3)

Multi-select. 18-category × selected/not-selected. Cell value is the category label when selected, blank when not.

#	Column header (abbreviated)
23	`Apparel (shoes, clothing, accessories)`
24	`Electronics (smart phone, computer, tablet)`
25	`Video games (gaming console, games)`
26	`Skincare (lotion, supplements, sunscreen)`
27	`Beauty (makeup, cosmetics)`
28	`Baby products/child safety (stroller, car seat, baby gate)`
29	`Supplements/health products (vitamins, protein powder, first aid kit)`
30	`Kitchen appliances (air fryer, blender, coffee maker)`
31	`Mattress/bedding (mattress, pillows, bed sheets)`
32	`Fitness equipment (dumbbells, yoga mat, treadmill)`
33	`Home improvement tools (drill, ladder, power saw)`
34	`Pet products (dog bed, crate, automatic feeder)`
35	`Furniture (desk, chair, bookshelf)`
36	`Fashion/clothing (jacket, jeans, handbag)`
37	`Grocery/food delivery (Instacart order, meal kit, restaurant delivery)`
38	`Outdoor/camping gear (tent, sleeping bag, cooler)`
39	`Automotive accessories (car seat cover, dash cam, phone mount)`
40	`Toys/games (LEGO set, board game, remote-control car)`
41	`Other` (option-label)
42	`Other` (free-text)

Question stem. "Which of the following have you purchased for yourself or your household in the past 12 months? (Select all that apply)"

Universe. All respondents who passed screening (Q1 + Q2).

E. Recent ≥$50 Purchase Context (Q4–Q6)

#	Column	Type	Response Set
43	`Think about the last time you bought a product online that cost $50 or more. From which category was that purchase?`	Single-select	One of the 18 categories from D (matrix anchor).
44	`For that purchase, how confident were you that you found the best product for your needs before completing the purchase?`	11-point scale (0–10)	`Not at all confident` (0), `1`, `2`, `3`, `4`, `Neutral` (5), `6`, `7`, `8`, `9`, `Extremely confident` (10)
45	`About how much time did you spend researching that purchase before you bought it?`	Single-select	`Less than 5 minutes`, `5–15 minutes`, `15–30 minutes`, `30–60 minutes`, `More than 1 hour`, `I did not do research`

F. Research Tools Used for That Purchase (Q7)

Multi-select. Cell value is the option label when selected, blank when not.

#	Column header (abbreviated)
46	`I did not do research`
47	`AI assistant (such as ChatGPT, Gemini, Claude, Perplexity)`
48	`Online customer reviews`
49	`Reddit or online forums`
50	`YouTube videos or reviewers`
51	`Friends or family`
52	`Expert publications or professionals`
53	`Brand or retailer website`
54	`Other` (option-label)
55	`Other` (free-text)

Question stem. "Which of the following did you use to research that purchase? (Select all that apply)"

G. Post-Purchase Feelings (Q8)

Multi-select. Cell value is the option label when selected, blank when not.

#	Column header (abbreviated)
56	`I felt confident I made the right choice`
57	`I wished I had researched more`
58	`I found a better option afterward`
59	`I returned the product`
60	`None of the above`

Question stem. "After completing that purchase, which of the following best describes how you felt? (Select all that apply)"

H. Purchase Research Confidence Index — PRCI (Q9)

18-category Likert matrix. Each column is one category; cell value is one of the 7-point scale labels.

Question stem. "In general, how confident are you in your ability to find the best product when shopping online in each of the following categories?"

Scale labels. Not at all confident (1), 2, 3, Neutral (4), 5, 6, Extremely confident (7).

#	Category	#	Category
61	Apparel	70	Fitness equipment
62	Electronics	71	Home improvement tools
63	Video games	72	Pet products
64	Skincare	73	Furniture
65	Beauty	74	Fashion/clothing
66	Baby products/child safety	75	Grocery/food delivery
67	Supplements/health products	76	Outdoor/camping gear
68	Kitchen appliances	77	Automotive accessories
69	Mattress/bedding	78	Toys/games

I. Retailer-Trust Statement (Q10)

#	Column	Type	Response Set
79	`'Online retailers always have my best interest as a customer in mind.'`	5-point agree-disagree	`Strongly disagree`, `Disagree`, `Neither agree nor disagree`, `Agree`, `Strongly agree`

Question stem. "How much do you agree or disagree with the following statement: …"

Used to compute net retailer trust (agree+strongly agree minus disagree+strongly disagree).

J. Information Trust Hierarchy — IH (Q11)

7-source Likert matrix. Each column is one information source; cell value is one of the 7-point scale labels.

Question stem. "When researching a product you may buy online, how much do you trust information from each of the following sources?"

Scale labels. Do not trust at all (1), 2, 3, Neutral (4), 5, 6, Trust completely (7).

#	Source
80	AI tool / assistant
81	Online customer reviews
82	Reddit or online forums
83	YouTube creators or reviewers
84	Friends or family
85	Expert publications or professionals
86	Brand or retailer website

K. AI Usage Block (Q12–Q14)

#	Column	Type	Response Set	Universe
87	`Have you used an AI assistant to help research a product you were considering buying in the past 90 days?`	Single-select	`Yes`, `No`	All completes (n=1,463)
88	`How satisfied were you with the AI assistant's help?`	7-point Likert	`Not at all satisfied` (1) … `Extremely satisfied` (7)	AI users (Q12 = Yes; n=623)
89	`Did you verify the AI assistant's recommendation through another source before making a purchase?`	Single-select	`Yes - always`, `Yes - sometimes`, `No`	AI users (Q12 = Yes; n=623)

This block produces two of the headline statistics in the report:

43% AI usage in past 90 days (Q12 = Yes ÷ full sample) — anchor: #stat-ai-usage-90-day
86% verification rate (Q14 ∈ {Yes-always, Yes-sometimes} ÷ AI-user subsample) — anchor: #stat-ai-verification-rate

L. AI Trust Index — ATI (Q15)

18-category Likert matrix. Each column is one category; cell value is one of the 7-point scale labels.

Question stem. "How much do you trust AI-generated product recommendations in each of the following categories?"

Scale labels. Do not trust at all (1), 2, 3, Neutral (4), 5, 6, Trust completely (7).

#	Category	#	Category
90	Apparel	99	Fitness equipment
91	Electronics	100	Home improvement tools
92	Video games	101	Pet products
93	Skincare	102	Furniture
94	Beauty	103	Fashion/clothing
95	Baby products/child safety	104	Grocery/food delivery
96	Supplements/health products	105	Outdoor/camping gear
97	Kitchen appliances	106	Automotive accessories
98	Mattress/bedding	107	Toys/games

M. AI Autonomy Threshold — AT (Q16)

#	Column	Type	Response Set
108	`What is the most you would spend on a product recommended by an AI assistant without checking any other source first?`	Single-select	`$0 — I would always verify`, `Under $25`, `$25 to $99`, `$100 to $499`, `$500 or more`, `No limit — I trust AI recommendations`

Headline statistic anchor. #stat-ai-autonomy-threshold — 42% selected $0 — I would always verify.

N. Review Trust (Q17)

#	Column	Type	Response Set
109	`How confident are you that online product reviews accurately reflect product quality?`	7-point Likert	`Not at all confident` (1) … `Extremely confident` (7)

O. Savings Behavior (SC) Block (Q18–Q32)

The variables in this section belong to the SC (Savings Behavior) instrument block and power SimplyCodes-side findings in the Checkout Gap industry report. They are documented here for codebook completeness; PAI-side Trust Report findings do not anchor on these variables.

O.1 Promo code search behavior (Q18)

#	Column	Type
110	`When you shop online, which of the following best describes how you typically look for promo codes or discounts?`	Single-select

O.2 Promo-source trust ranking (Q19) — 7-source ranking matrix

Question stem: "When you do search for a promo code, where do you trust finding one the most? Rank the following choices in terms of trust: 1st - most trust; 7th - least trust"

#	Source
111	Coupon websites
112	Browser extensions
113	Reddit or online forums
114	TikTok or social media
115	Brand newsletters or emails
116	Friends or family
117	Google search

O.3 Promo-code reliability sentiment (Q20)

#	Column	Type
118	`Compared with a year or two ago, do promo codes feel more reliable, less reliable, or about the same?`	Single-select

O.4 Savings-tools trust matrix (Q21) — 5-tool Likert matrix

Question stem: "How much do you trust each of the following savings tools to actually save you money?"

#	Tool
119	Browser extensions
120	Coupon websites
121	Cash-back apps
122	Brand loyalty programs
123	Social media deal accounts

O.5 Recent promo experience (Q22–Q23)

#	Column	Type
124	`In the past 60 days, did you try using a promo code online?`	Single-select
125	`Did the code work and did you receive the promised discount?`	Single-select

O.6 Promo-failure mode (Q24) — multi-select

Question stem: "Thinking about the most recent time a promo code did not work, what happened? (Select all that apply)"

#	Failure mode
126	The code was expired
127	The code had restrictions I did not know about
128	The code was only for new customers
129	The discount was smaller than advertised
130	The site showed an error message
131	The code applied, then disappeared
132	Other (option-label)
133	Other (free-text)

O.7 Post-failure behavior (Q25) — multi-select

Question stem: "When that code did not work, what did you do next? (Select all that apply)"

#	Action
134	Searched for another code
135	Bought anyway at full price
136	Left and looked at another retailer
137	Abandoned the purchase
138	Came back later to try again
139	Contacted customer service
140	Other (option-label)
141	Other (free-text)

O.8 Promo-experience consequences (Q26–Q32)

#	Column	Type
142	`After that experience, how did it affect your perception of the brand or retailer?`	Single-select
143	`When a promo code does not work, who do you think is most responsible?`	Single-select
144	`Do you believe some retailers intentionally make promo codes harder to use than they need to be?`	Single-select
145	`Have you ever bought from a different retailer specifically because you found a working promo code there instead?`	Single-select
146	`When a code does work, does the discount usually match what was advertised?`	Single-select
147	`Have you ever completed a purchase at full price and then discovered a working promo code afterward?`	Single-select
148	`Would you use a tool that verifies whether promo codes actually work before you try them?`	Single-select

P. Demographics (Q33–Q35)

#	Column	Type	Response Set
149	`What is your gender?`	Single-select	`Female`, `Male` (open-ended in instrument; the export shows only the two value labels that appeared in completes)
150	`What is your annual household income before taxes?`	Single-select	`Less than $30K`, `$30-$39K`, `$40-$59K`, `$60-$74K`, `$75-$100K`, `$100-$150K`, `$150,000 and over`
151	`What is your ethnicity?`	Single-select	`White (non-Hispanic)`, `Black or African American`, `Hispanic or Latino`, `Asian or Pacific Islander`, `Other`
152	`Other:What is your ethnicity?`	Free text	Open-ended write-in for respondents who selected `Other` on Q35

Q. Completion Flag

#	Column	Type	Notes
153	`Complete`	Boolean	Internal Alchemer disposition flag. Redundant with `Status` (col 4).

Derived Variables & Composite Scores

The report references several composite measures constructed from the raw variables above.

PRCI Composite (Purchase Research Confidence Index)

Definition. Mean of 18-category PRCI scale responses (cols 61–78). Per-respondent: convert each Likert label to 1–7, take simple average across the 18 categories. Per-category: take mean across all respondents in the relevant universe.

Anchors. #stat-prci-overall (full-sample average), #stat-prci-by-category-range (per-category min/max).
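
The per-respondent and per-category constructions can be sketched as below. The codebook does not say how blank matrix cells are handled, so skipping them is an assumption here; the helper names are hypothetical, and the two-column list stands in for the 18 PRCI columns.

```python
PRCI_LABELS = {"Not at all confident": 1, "Neutral": 4, "Extremely confident": 7}
VALID_CODES = {str(n) for n in range(1, 8)}

def likert_1_7(cell, labels=PRCI_LABELS):
    """Map a text anchor or numeric code to 1-7; None if blank or unrecognized."""
    text = "" if cell is None else str(cell).strip()
    if text in labels:
        return labels[text]
    return int(text) if text in VALID_CODES else None

def respondent_composite(row, columns):
    """Simple average of one respondent's answered categories (assumption: blanks skipped)."""
    scores = [s for s in (likert_1_7(row.get(c)) for c in columns) if s is not None]
    return sum(scores) / len(scores) if scores else None

def category_means(rows, columns):
    """Mean score per category across respondents who answered that category."""
    out = {}
    for c in columns:
        scores = [s for s in (likert_1_7(r.get(c)) for r in rows) if s is not None]
        out[c] = sum(scores) / len(scores) if scores else None
    return out

# Illustrative rows (not real data).
cols = ["Apparel", "Electronics"]
rows = [
    {"Apparel": "Extremely confident", "Electronics": "5"},
    {"Apparel": "3", "Electronics": "Neutral"},
    {"Apparel": "", "Electronics": "6"},
]
print(respondent_composite(rows[0], cols))  # 6.0
print(category_means(rows, cols))           # {'Apparel': 5.0, 'Electronics': 5.0}
```

Passing a different `labels` mapping (for example the `Do not trust at all` / `Trust completely` anchors) would adapt `likert_1_7` to the other 7-point matrices.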

ATI Composite (AI Trust Index)

Definition. Mean of 18-category ATI scale responses (cols 90–107). Same construction as PRCI.

Anchors. #stat-ai-trust-by-category-range.

IH Ranking (Information Trust Hierarchy)

Definition. Per-source mean trust score across all completes (cols 80–86). Ranked descending.

Anchors. #stat-trust-hierarchy-ranking.
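
A minimal sketch of the ranking step, assuming blank cells are skipped and ties keep their input order (neither is specified in this codebook). The function names and sample rows are hypothetical.

```python
IH_LABELS = {"Do not trust at all": 1, "Neutral": 4, "Trust completely": 7}
VALID_CODES = {str(n) for n in range(1, 8)}

def trust_score(cell):
    """Map a text anchor or numeric code to 1-7; None if blank or unrecognized."""
    text = "" if cell is None else str(cell).strip()
    if text in IH_LABELS:
        return IH_LABELS[text]
    return int(text) if text in VALID_CODES else None

def rank_sources(rows, source_columns):
    """Return (source, mean) pairs sorted from most to least trusted."""
    means = []
    for col in source_columns:
        scores = [s for s in (trust_score(r.get(col)) for r in rows) if s is not None]
        if scores:
            means.append((col, sum(scores) / len(scores)))
    return sorted(means, key=lambda pair: pair[1], reverse=True)

# Illustrative rows (not real data).
rows = [
    {"Friends or family": "Trust completely", "AI tool / assistant": "3"},
    {"Friends or family": "5", "AI tool / assistant": "Neutral"},
]
ranking = rank_sources(rows, ["AI tool / assistant", "Friends or family"])
print(ranking)  # [('Friends or family', 6.0), ('AI tool / assistant', 3.5)]
```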

Net Retailer Trust

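Definition. Percentage of completes who agreed or strongly agreed with the retailer-trust statement (col 79) minus the percentage who disagreed or strongly disagreed. `Neither agree nor disagree` responses enter neither percentage; the variable map below lists the denominator as the full sample (n=1,463).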

Anchors. #stat-net-retailer-trust.
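
The net score can be sketched as follows. The denominator here is every row passed in, which assumes the full complete sample is the base (as the variable map lists for this stat); the function name and the `"q10"` key are placeholders.

```python
AGREE = {"Agree", "Strongly agree"}
DISAGREE = {"Disagree", "Strongly disagree"}

def net_retailer_trust(rows, column):
    """Percent agreeing minus percent disagreeing, in percentage points."""
    if not rows:
        return None
    answers = [str(r.get(column) or "").strip() for r in rows]
    agree = sum(a in AGREE for a in answers)
    disagree = sum(a in DISAGREE for a in answers)
    # Assumption: neutral answers stay in the denominator but in neither term.
    return 100.0 * (agree - disagree) / len(rows)

# Illustrative rows (not real data).
rows = [{"q10": a} for a in [
    "Strongly agree", "Agree", "Neither agree nor disagree", "Disagree",
]]
print(net_retailer_trust(rows, "q10"))  # 25.0
```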

AT Bands (AI Autonomy Threshold)

Definition. Categorical bands on col 108 response. Reported "wouldn't trust over $X" calculations sum bands above each threshold.

Anchors. #stat-ai-autonomy-threshold.

Verification Rate

Definition. Among AI users (col 87 = Yes; n=623), the percentage who answered Yes - always OR Yes - sometimes on col 89.

Anchors. #stat-ai-verification-rate — 86%.
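
A minimal sketch of this calculation, using the value labels documented for cols 87 and 89. The column keys `"used_ai"` and `"verified"` are placeholders for the full question-text headers, and the sample rows are invented.

```python
VERIFIED = {"Yes - always", "Yes - sometimes"}

def verification_rate(rows, used_ai_col, verified_col):
    """Percent of AI users who verified always or sometimes."""
    ai_users = [r for r in rows if str(r.get(used_ai_col) or "").strip() == "Yes"]
    if not ai_users:
        return None
    verified = sum(
        str(r.get(verified_col) or "").strip() in VERIFIED for r in ai_users
    )
    return 100.0 * verified / len(ai_users)

# Illustrative rows (not real data).
rows = [
    {"used_ai": "Yes", "verified": "Yes - always"},
    {"used_ai": "Yes", "verified": "Yes - sometimes"},
    {"used_ai": "Yes", "verified": "No"},
    {"used_ai": "Yes", "verified": "Yes - sometimes"},
    {"used_ai": "No", "verified": ""},
]
print(verification_rate(rows, "used_ai", "verified"))  # 75.0
```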

Canonical Statistics → Variable Map

Every statistic anchored in the published report maps to a specific variable or composite. This table aligns the report's #stat-* anchors to the codebook columns that drove them.

Stat anchor	Variables	Denominator
`#stat-ai-usage-90-day`	col 87	Full sample (n=1,463)
`#stat-ai-verification-rate`	col 89	AI-user subsample (n=623)
`#stat-ai-autonomy-threshold`	col 108	Full sample (n=1,463)
`#stat-ai-satisfaction-mean`	col 88	AI-user subsample (n=623) — apply REMAP_1_7 (see Mixed-format encoding correction above)
`#stat-prci-overall`	col 44 (PC-1)	Full sample non-missing (n=1,453) — apply REMAP_0_10 (see Mixed-format encoding correction above)
`#stat-prci-by-category-range`	cols 61–78 (per-category)	Full sample (n=1,463)
`#stat-trust-hierarchy-ranking`	cols 80–86	Full sample (n=1,463)
`#stat-ai-trust-by-category-range`	cols 90–107 (per-category)	Full sample (n=1,463)
`#stat-net-retailer-trust`	col 79	Full sample (n=1,463)
`#stat-review-trust`	col 109 (AT-7)	Actual review users (n=728 selected "Online customer reviews" in PC-4) — apply REMAP_1_7

For the full Glass Box footnote format that anchors each stat back to its denominator, see §7 of the published report or the productai:glassBoxFootnoteFormat block in the methodology entity.

Release Notes

v1.0 (2026-06-23) — Initial publication alongside the Trust in AI Commerce Report v1.

Contact

Questions about specific variables, derivation rules, or replication: research@product.ai.
