How to set up and run A/B tests in Aimfox to compare LinkedIn connection request messages, track acceptance rates per variant, and scale winning copy.
James Whitfield
Lead gen agency owner, 50+ campaigns/month · Updated June 23, 2026
Last updated: July 2026 · James Whitfield, Lead gen agency owner, 50+ campaigns/month
Most LinkedIn outreach campaigns run one message variant until the campaign ends, then move on. The problem is that you never know if a different note would have produced 50% more acceptances. A/B testing removes that uncertainty by systematically comparing variants against each other on the same audience.
Aimfox allows you to split your prospect list between message variants and track performance separately for each. The winner scales; the loser is retired. Over several testing cycles, your connection notes and follow-up messages improve to the point where they consistently outperform industry averages. This guide covers how to set up an A/B test in Aimfox, what to test, how to read the results correctly, and what to do with the results to maximize LinkedIn outreach performance at scale.
Acceptance rates on LinkedIn connection campaigns vary significantly based on how the note is written. A generic note to the same audience might achieve 15% acceptance while a specific, personalised variant achieves 35%. That difference compounds through the entire sequence:
| Metric | Generic note (15% acceptance) | Specific note (35% acceptance) |
|---|---|---|
| Requests sent | 500 | 500 |
| Connections accepted | 75 | 175 |
| Follow-up replies (at 10%) | 7 | 17 |
| Conversations generated | 7 | 17 |
Source: LinkedIn's official connection and outreach policy and Aimfox reviews on G2 — verified June 2026
A 20-percentage-point improvement in connection note acceptance more than doubles the number of conversations generated from the same prospect list. This is why testing connection notes is the highest-leverage place to start.
The case for systematic A/B testing becomes even stronger when you consider the alternative: optimizing by feel. Most outreach practitioners make message changes based on anecdotal impressions ("this note feels better") without the data to confirm whether the change actually improved results. After six months of anecdotal optimization, they have no reliable knowledge of what is driving their results. After six months of systematic A/B testing, they have a validated library of what works with their specific audience, tested against real performance data. The difference in performance is significant and cumulative.
Before running tests, understanding what makes a result valid prevents two common errors: declaring a winner too early and drawing false conclusions from invalid comparisons.
Statistical significance in the context of LinkedIn outreach:
Statistical significance is the concept that a result is unlikely to have occurred by chance. In LinkedIn A/B testing, you are comparing two acceptance rates and asking whether the difference between them reflects a real performance difference or just random variation in who happened to accept on a given day.
The practical guideline for LinkedIn outreach A/B testing: a difference of 5 percentage points or more that is consistent across the entire test period (not driven by a single day's results) on a sample of 100+ per variant is meaningful enough to act on. Below this threshold, the result is not reliable enough to declare a winner.
Why small samples are misleading:
Imagine Variant A sends 20 requests on day 1 and gets 6 acceptances (30%). Variant B sends 20 requests and gets 3 acceptances (15%). Is Variant A better? You cannot tell — 20 requests per variant is far too small a sample. The next day might reverse entirely. Only after 100+ sends per variant do the daily fluctuations average out enough to see a reliable signal.
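One way to put a number on this is a two-proportion z-test. This is a standard statistical check rather than anything Aimfox provides, and the function name below is hypothetical. The sketch applies it to the day-1 example and then to the same acceptance rates at ten times the volume. The normal approximation is rough at 20 sends, so treat the small-sample figure as indicative only.

```python
import math

def two_proportion_p_value(accepted_a, sent_a, accepted_b, sent_b):
    """Two-sided p-value for the difference between two acceptance rates."""
    rate_a = accepted_a / sent_a
    rate_b = accepted_b / sent_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (accepted_a + accepted_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# The day-1 example from the text: 6/20 vs 3/20 acceptances.
p_small = two_proportion_p_value(6, 20, 3, 20)
# The same 30% vs 15% rates after 200 sends per variant.
p_large = two_proportion_p_value(60, 200, 30, 200)

print(f"20 sends per variant:  p = {p_small:.3f}")
print(f"200 sends per variant: p = {p_large:.5f}")
```

A p-value above the conventional 0.05 cut-off means the gap could plausibly be chance. The 20-send result lands well above that line, while the identical rates at 200 sends fall far below it.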
The common mistake of evaluating mid-test:
Looking at A/B test results on day 3 of a 14-day test and making decisions based on what you see is a reliability trap. Early results are dominated by whoever happened to log into LinkedIn and check their connection requests in the first few days. This is a biased sample. Wait for the full test period before evaluating.
| Element | Examples |
|---|---|
| Connection note tone | Professional vs conversational |
| Opening line | Role-specific vs company-specific |
| Note length | 2 sentences vs 4 sentences |
| Call-to-action | Soft question vs direct ask |
| Follow-up step 1 | Value-focused vs curiosity-focused |
| Follow-up step 2 | Specific ask vs open-ended check-in |
Test one variable at a time. If you change both the tone and the length in the same test, you cannot isolate which change drove the result.
The optimal A/B testing sequence follows the order in which prospects experience the campaign. Improving connection note acceptance rate is the highest-leverage place to start because it expands the audience that sees all subsequent follow-up messages. Once acceptance rate is optimized, improvements to follow-up messages apply to a larger base.
Phase 1: Connection note optimization (Month 1–2)
Test the connection note first. Run 2–3 testing cycles focused exclusively on the note:
After 3 cycles, you have a validated connection note optimized for your ICP. Use this as the fixed baseline for all subsequent follow-up testing.
Phase 2: Follow-up message 1 optimization (Month 2–3)
With a fixed, optimized connection note in place, begin testing Follow-up Message 1. This is the first message sent after a connection is accepted. Test:
Phase 3: Full sequence optimization (Month 3+)
With connection note and Follow-up 1 optimized, test Follow-up Messages 2 and 3. At this point, you have a fully tested and validated outreach sequence that has been systematically improved from first touch to final follow-up.
This sequential approach means that by month 3, the campaign is significantly better than month 1 across every element, and the improvements are evidence-based rather than intuition-based.
In Aimfox, create two separate campaigns targeting the same audience type. Use the same LinkedIn search URL or a split of the same prospect list — ensure the two audiences are as similar as possible to isolate the message as the variable being tested.
Name the campaigns clearly: "Q3 SaaS Founders — Variant A" and "Q3 SaaS Founders — Variant B". This makes it easy to compare results in the Analytics section without confusion.
Create a tracking document for the test — even a basic spreadsheet — that records:
Without this documentation, it is easy to confuse which variant tested which element after running multiple cycles.
Write two distinct versions of the element you are testing. For a connection note test:
Variant A — Role-specific note: Reference the prospect's specific job title and what people in that role typically face. Keep it to 40–50 words.
Variant B — Company-specific note: Reference something specific about the prospect's company — industry position, recent news, or company stage. Keep it to 40–50 words.
Everything else should be identical between the two campaigns: the same LinkedIn account sending, the same daily limit, the same working hours, the same follow-up sequence.
Writing notes that actually differ:
A common mistake is writing two "different" variants that are actually just word-swapped versions of the same approach. "Hi [Name], I help VP Sales professionals at SaaS companies improve their outbound" and "Hello [Name], I work with VP Sales leaders in SaaS to improve outbound" are not meaningfully different. Both will produce nearly identical acceptance rates, because prospects are responding to the same underlying message.
Write variants that represent genuinely different approaches: different persona framing, different problem references, different conversational register. If you cannot explain in a single sentence what is different about each variant and why you expect them to perform differently, the test will not generate useful data.
Divide your prospect source evenly between the two campaigns. If you have a LinkedIn search with 1,000 profiles, run 500 through Variant A and 500 through Variant B. Aimfox lets you set a prospect limit per campaign, so you can control this precisely.
Avoid testing on lists smaller than 200 total (100 per variant). Below that, random variation in who happens to see your note on a given day can produce misleading results.
How to split the list for comparability:
If using a LinkedIn search, do not simply send Variant A to the first 500 results and Variant B to the next 500 results. LinkedIn search results are ordered and the first 500 may be systematically different from the next 500 (newer connections, more active users, etc.). A cleaner approach is to use a third-party sourcing tool or Quarvio to get the prospect list first, then randomly split it before importing into Aimfox.
If you are sourcing directly from LinkedIn's search in Aimfox, accept that there may be some ordering bias and account for it by running the test long enough for both variants to reach the same random mix of the audience.
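If you do export the list first, the random split itself takes only a few lines. This is a minimal sketch that assumes the prospects are available as a plain list of profile URLs. The `split_prospects` helper and the example URLs are hypothetical, and importing each half into Aimfox is left out.

```python
import random

def split_prospects(prospects, seed=42):
    """Shuffle a prospect list and split it into two halves of near-equal size."""
    shuffled = list(prospects)             # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Hypothetical profile URLs standing in for an exported prospect list.
prospects = [f"https://www.linkedin.com/in/example-{i}" for i in range(1000)]
variant_a, variant_b = split_prospects(prospects)
print(len(variant_a), len(variant_b))  # 500 500
```

Shuffling before splitting removes any ordering in the source list, so neither variant systematically receives the earlier search results.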
Configure both campaigns with identical parameters except for the message being tested:
Why identical parameters matter:
If Variant A runs Monday–Friday and Variant B runs Monday–Sunday, any difference in acceptance rate may be caused by the weekend sends rather than the message. If Variant A has a daily limit of 20 and Variant B has a limit of 30, Variant B may reach different prospect types due to the extended reach. Every parameter difference is a potential confounding variable that makes the test results uninterpretable.
Let both campaigns run simultaneously for at least 2 weeks before evaluating results. This ensures both variants have sent a comparable number of requests and reduces the influence of day-of-week variation (some days have higher response rates on LinkedIn than others).
Monitor but do not adjust the campaigns mid-test. Changing a variable mid-way invalidates the comparison.
What to monitor during the test (without acting on it):
Check the campaigns every 3–4 days to confirm both are running correctly: both are sending at the configured daily rate, no technical errors, both following the same schedule. If one campaign pauses due to a technical issue and the other continues, the test data is compromised for that period. Note any interruptions in the tracking document and adjust the planned test end date to compensate.
Do not check the performance metrics until the test period is complete. Checking intermediate results and then continuing creates a temptation to end the test early if one variant looks like a clear leader — but early-stage data is not reliable data, and ending the test early risks acting on a false lead.
After at least 2 weeks, and once each variant has sent at least 100 requests, compare:
The variant with a statistically meaningful advantage on acceptance rate wins. "Meaningful" means a consistent difference of 5 percentage points or more across the full test period, not a single-day spike.
How to read Aimfox analytics:
In Aimfox's analytics dashboard, each campaign shows connection request count, acceptance count, acceptance rate, reply count, and reply rate. For an A/B test evaluation, open both campaign analytics views simultaneously and compare:
Handling inconclusive results:
If the two variants produce results within 3 percentage points of each other after a full test, the result is inconclusive. This does not mean the test failed — it means the two variants perform similarly for your audience. Retire both and write two new variants that differ more fundamentally. An inconclusive result is valuable information: it tells you that the specific variable you tested does not significantly affect acceptance rate for your audience, which narrows the search space for future tests.
Pause the losing campaign variant. Scale the winning variant to your full prospect list. Run it as your standard campaign until the next testing cycle.
After the winning variant has run for 3–4 weeks at full scale, design the next A/B test. Iterate one variable at a time. Over 3–4 testing cycles, your connection notes and follow-up messages will be significantly better-tuned to your specific audience than any generic template.
After optimizing connection notes and immediate follow-up messages, the most impactful variables to test are often structural rather than copy-based:
Sequence length testing:
Compare a 3-step sequence (connection note + 2 follow-ups) against a 4-step sequence (connection note + 3 follow-ups). Measure total conversations generated per 100 connection requests. This tells you whether the additional follow-up step is generating conversations or just increasing send volume without proportional return.
Sequence timing testing:
Compare a tight sequence (follow-up sent 2 days after connection) versus a spaced sequence (follow-up sent 5 days after connection). Acceptance rate should be the same, since follow-up timing cannot affect a connection note the prospect has already responded to, but reply rate to follow-up messages may differ depending on how recently the connection was made when the follow-up arrives.
Segmentation approach testing:
Compare a single campaign targeting a broad audience (VP Sales + Director of Sales + CRO) against three segmented campaigns, each with tailored messaging for the specific role. The segmented campaigns may produce higher acceptance and reply rates if the messaging is genuinely differentiated for each role, or they may perform similarly if the audience's concerns are homogeneous enough that role-specific targeting does not add value.
Channel combination testing:
For prospects who can be reached through both Aimfox LinkedIn outreach and Instantly cold email, test whether combining the channels produces better total conversion than either channel alone. Run LinkedIn outreach on Cohort A, email outreach on Cohort B, and LinkedIn + email on Cohort C. Woodpecker's multichannel outreach research puts the multi-channel advantage at 40–60% over either single-channel approach, but only your own test confirms whether that holds for your specific audience.
LinkedIn A/B test data reveals more than just which LinkedIn message performs better — it tells you what messaging your ICP responds to. These insights transfer directly to cold email copy.
If Variant A's role-specific framing outperformed Variant B's company-specific framing in LinkedIn tests: Apply the role-specific framing to cold email personalization. Write email templates that reference what people in the prospect's role deal with rather than what you know about the prospect's company.
If the problem-first follow-up message outperformed the value-first message: Use problem-first framing in email body copy. Open cold emails with the problem the reader is experiencing rather than the outcome your solution delivers.
If shorter connection notes (40 words) outperformed longer ones (60 words): Apply brevity to cold email. If prospects are accepting shorter LinkedIn messages at higher rates, they likely prefer shorter emails as well. Test shorter email copy in parallel with your LinkedIn findings.
LinkedIn and cold email are different channels with different conventions, but they share the same audience (your ICP) and therefore share the same underlying preferences about how they like to receive outreach. LinkedIn A/B test data is free audience research that applies across channels.
"We A/B tested connection note variants in Aimfox over a 3-week period on the same target audience. Variant B — which referenced the prospect's company industry rather than their job title — produced a 31% acceptance rate versus 18% for Variant A. That improvement changed our entire campaign baseline." — G2 reviewer, Aimfox reviews on G2
Aimfox holds a 4.6/5 rating on G2. Woodpecker's multichannel outreach research shows that combining optimised LinkedIn messages with email outreach produces 40–60% higher reply rates than either channel alone — A/B testing ensures both channels are running optimised copy.
| Parameter | Required setting | Why |
|---|---|---|
| Daily connection request limit | Same for both campaigns | Different limits create different audience reach patterns |
| Working hours | Identical (e.g., Mon–Fri 9am–5pm) | Prevents day-of-week bias from different schedules |
| LinkedIn account | Same account for both campaigns | Different accounts have different network reach |
| Follow-up sequence | Identical (when testing connection note) | Isolates the connection note as the only variable |
| Connection note | Identical (when testing follow-up messages) | Isolates follow-up messages as the variable |
| Start date | Same day | Prevents temporal bias from different market conditions |
| Confidence level | Minimum sends per variant | Minimum difference to detect |
|---|---|---|
| Indicative (act with caution) | 100 per variant | 8+ percentage points |
| Reliable | 200 per variant | 5+ percentage points |
| High confidence | 500 per variant | 3+ percentage points |
| Very high confidence | 1,000+ per variant | 1–2 percentage points |
For most LinkedIn outreach A/B tests, the "Reliable" threshold (200 per variant, 5+ point difference) is the appropriate standard. Tests below this threshold should inform future testing rather than immediately change campaign strategy.
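The thresholds can be encoded as a small lookup so every test result is graded the same way. This sketch simply mirrors the table above. The `confidence_tier` name is hypothetical, and using 1.0 point for the top tier is an assumption, since the table gives a range of 1–2 points.

```python
# Thresholds copied from the table: (minimum sends per variant,
# minimum difference in percentage points, label). The 1.0 for the top
# tier is an assumption: the table gives a range of 1-2 points.
TIERS = [
    (1000, 1.0, "Very high confidence"),
    (500, 3.0, "High confidence"),
    (200, 5.0, "Reliable"),
    (100, 8.0, "Indicative (act with caution)"),
]

def confidence_tier(sends_per_variant, diff_points):
    """Return the strongest tier whose sample size and difference are both met."""
    for min_sends, min_diff, label in TIERS:
        if sends_per_variant >= min_sends and abs(diff_points) >= min_diff:
            return label
    return "Below threshold"

print(confidence_tier(250, 6.0))  # Reliable
print(confidence_tier(120, 5.0))  # Below threshold
```

A 5-point gap at 120 sends falls below every tier because the 100-send row requires at least 8 points.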
| Metric | When to track | What it tells you |
|---|---|---|
| Requests sent | Daily (monitoring only) | Confirms both variants are running at similar pace |
| Acceptance rate | After test completion | Primary metric for connection note tests |
| Reply rate | After test completion | Primary metric for follow-up message tests |
| Conversation rate | After test completion | Overall efficiency metric (conversations per 100 requests) |
| Pending acceptance | After test completion | How many requests are still awaiting response |
Pending acceptances — requests that have been sent but not yet accepted or declined — should be excluded from the acceptance rate calculation when comparing results, unless both variants have the same pending-to-completed ratio.
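As a sketch of that calculation, the rate can be computed over resolved requests only. The function name and the counts below are hypothetical, and the code assumes you can read sent, accepted, and pending counts for each campaign.

```python
def resolved_acceptance_rate(sent, accepted, pending):
    """Acceptance rate over requests that are no longer pending."""
    resolved = sent - pending
    if resolved <= 0:
        return None  # nothing resolved yet, so no rate to report
    return accepted / resolved

# Hypothetical end-of-test counts: same raw rate (48/200), different backlog.
rate_a = resolved_acceptance_rate(sent=200, accepted=48, pending=40)
rate_b = resolved_acceptance_rate(sent=200, accepted=48, pending=80)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}")
```

Both variants show 24% on raw counts, but the resolved rates differ because Variant B has twice as many requests still awaiting a response.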
| Result pattern | Interpretation | Action |
|---|---|---|
| Variant A outperforms by 5+ points consistently | Clear winner | Scale Variant A immediately, retire B |
| Variant B outperforms by 5+ points consistently | Clear winner | Scale Variant B immediately, retire A |
| Within 3 points of each other | Inconclusive | Design a new test with more different variants |
| Early lead reversed by end of test | False early lead | Use end-of-test data only; early data was noise |
| One variant performed better on acceptance, other on reply rate | Split result | Keep connection note from acceptance winner; use follow-up from reply-rate winner |
Symptoms: After 10 days, Variant A has sent 180 requests and Variant B has sent only 45 requests. The acceptance rate comparison shows Variant A at 28% and Variant B at 31%, but the sample size for Variant B is too small to trust.
Cause: The most common cause is that Variant B's campaign has a lower daily limit configured, or Variant B's LinkedIn search returned fewer prospects than expected. If LinkedIn's search for Variant B's defined audience is smaller, the campaign exhausts the list faster and then idles.
Fix: Check Variant B's campaign configuration to confirm the daily limit matches Variant A. If the prospect pool was smaller than expected, add additional prospects to Variant B's campaign from the same audience type. Do not restart the test — just add prospects and allow Variant B to catch up. Extend the test period by however many days it takes Variant B to reach 100+ sends.
Symptoms: Both Variant A and Variant B are showing acceptance rates of 6–8%, well below the typical 15–35% range. The messages seem well-written.
Cause: The acceptance rate problem is most likely not the message — it is the audience quality. Connection acceptance rates this low usually indicate that a large portion of the prospect list has LinkedIn profiles that are inactive or not regularly monitored. Many LinkedIn accounts remain active in LinkedIn's search database but are checked infrequently by the user.
Fix: Check the profile quality of the prospect list used in the test. Look for: low profile completion scores, no recent activity (no recent posts or engagement), job titles that suggest the account is a placeholder rather than an active user. Replace low-quality profiles with higher-engagement contacts from Quarvio. Acceptance rates in the 15–35% range indicate an active, engaged prospect pool — if rates are consistently below 10% on tested messaging, the audience quality is the constraint, not the messaging.
Symptoms: Reviewing the weekly data in Aimfox analytics, Variant A outperformed Variant B in Week 1 (A 32% vs B 19%), then Variant B outperformed Variant A in Week 2 (B 38% vs A 25%), leaving no clear winner over the full two-week period.
Cause: This pattern typically reflects audience heterogeneity: the prospect list contains two distinct audience types that respond differently to each variant. The LinkedIn search may have returned a mix of company sizes, seniority levels, or industries that have different preferences, and each variant happened to reach a different mix in each week.
Fix: Segment the prospect list more narrowly before the next test. Split by company size (small vs mid-market), seniority level (VP vs Director), or industry (SaaS vs services). Run separate A/B tests for each segment. The audience-specific results will be more reliable than results across a heterogeneous mix.
Symptoms: Variant A has 32% acceptance rate. Variant B has 24% acceptance rate. But the prospects who accepted from Variant B are replying at 22%, while Variant A's accepted connections reply at only 9%.
Cause: Variant A's connection note is more accessible or appealing but may set lower expectations, attracting connections who are not genuinely interested in the topic. Variant B's connection note is more selective — it attracts fewer but more engaged connections, producing a higher reply rate among those who do accept.
Fix: Calculate the conversations per 100 requests for each variant. Variant A: 100 requests × 32% acceptance × 9% reply = 2.9 conversations. Variant B: 100 requests × 24% acceptance × 22% reply = 5.3 conversations. Variant B generates more conversations per request despite having a lower acceptance rate. In this scenario, the higher reply rate variant is the functional winner — use conversation rate (not acceptance rate alone) as the decision metric.
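The same arithmetic as a reusable helper, using the figures from this scenario. The function name is hypothetical, and the calculation assumes the reply rate is measured among accepted connections.

```python
def conversations_per_100(acceptance_rate, reply_rate):
    """Expected conversations per 100 connection requests."""
    return 100 * acceptance_rate * reply_rate

variant_a = conversations_per_100(acceptance_rate=0.32, reply_rate=0.09)
variant_b = conversations_per_100(acceptance_rate=0.24, reply_rate=0.22)
print(f"A: {variant_a:.1f}  B: {variant_b:.1f}")  # A: 2.9  B: 5.3
```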
Symptoms: At 100 sends per variant, Variant A was winning 33% to 19%. At 250 sends, the gap has narrowed to 29% to 24%. At 400 sends, both are at around 27%.
Cause: The early test audience within the LinkedIn search may have been the most active and engaged segment of the total population — these prospects accepted at a higher rate because they are consistently active on LinkedIn. As the campaign reaches deeper into the search results, the audience quality declines and both variants' performance converges.
Fix: The correct interpretation here is that the campaign is reaching audience quality saturation, not that the message variants are equal. The quality of the remaining prospect pool is the binding constraint, not the messaging. Rather than continuing the test on a depleted list, keep either variant running for the remaining prospects and add fresh, high-quality prospects from Quarvio to restart the test on a cleaner audience sample.
Symptoms: Aimfox paused Variant B's campaign for 3 days mid-test. During those 3 days, Variant A continued running. Now the sends per variant are 310 vs 145.
Cause: LinkedIn safety features or Aimfox's internal safety controls paused the campaign. This may be due to the LinkedIn account approaching the connection request limit for the current account warmup level, a login issue, or a LinkedIn security check.
Fix: Restart Variant B and extend the test period to allow it to catch up to Variant A's send count. Document the pause dates in the tracking document. Do not use data from the pause window (when A was running and B was paused) in the final comparison. Evaluate results only from the days when both campaigns were running simultaneously.
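If you keep a daily log in the tracking document, restricting the comparison to days when both campaigns were sending is mechanical. This is a sketch under assumptions: the log format (date mapped to sent and accepted counts) and the function name are hypothetical, and acceptances are attributed to the day the request was sent.

```python
def overlapping_totals(daily_a, daily_b):
    """Sum sends and acceptances over days on which both variants sent requests.

    Each argument maps a date string to a (sent, accepted) tuple.
    """
    totals = {"A": [0, 0], "B": [0, 0]}
    for day in sorted(daily_a.keys() & daily_b.keys()):
        sent_a, acc_a = daily_a[day]
        sent_b, acc_b = daily_b[day]
        if sent_a == 0 or sent_b == 0:
            continue  # one campaign was paused that day, so skip it for both
        totals["A"][0] += sent_a
        totals["A"][1] += acc_a
        totals["B"][0] += sent_b
        totals["B"][1] += acc_b
    return totals

# Hypothetical daily log: Variant B was paused on the 3rd and 4th.
daily_a = {"2026-06-01": (20, 6), "2026-06-02": (20, 5),
           "2026-06-03": (20, 7), "2026-06-04": (20, 4)}
daily_b = {"2026-06-01": (20, 4), "2026-06-02": (20, 6),
           "2026-06-03": (0, 0), "2026-06-04": (0, 0)}
print(overlapping_totals(daily_a, daily_b))
```

Only the first two days count toward either variant here, so both are compared on 40 sends each.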
Symptoms: After 5 A/B testing cycles over 4 months, the best acceptance rate achieved is 22%, and the most recent test cycles are producing variants within 2–3 points of each other.
Cause: The campaign has likely reached the performance ceiling for the current audience type and LinkedIn outreach format. The connection note and follow-up message are as optimized as testing can achieve with this audience.
Fix: The next performance improvement requires changes outside the message: audience expansion (target a related but different ICP segment), channel expansion (add Instantly cold email as a parallel channel to the same prospects), or offer development (change what the conversation is about rather than just how it is initiated). A/B testing optimizes the path to a conversation — if the ceiling has been reached, the next variable to test is the offer itself, not the mechanics of how it is delivered.
Symptoms: Aimfox shows 28% acceptance rate and 15% reply rate, which look like strong numbers. But over 8 weeks of running, only 2 meetings have been booked from LinkedIn outreach.
Cause: High reply rate with low meeting rate means the conversations are not converting. Most replies are likely "not interested" responses or neutral acknowledgments that do not progress toward a meeting. This is a different problem from what A/B testing of the message variants addresses — it is a conversation quality problem, not a message quality problem.
Fix: Review the content of the replies coming in through Aimfox Unibox. If replies are mostly declines, the ICP targeting may be off — the people accepting and replying are not the right buyers. If replies are neutral ("thanks, I'll keep it in mind"), the conversation is being ended too early without advancing to a meeting. The fix is likely in the Unibox conversation handling (how you respond to interested replies) or in the ICP targeting (using more selective audience criteria to focus on prospects with genuine buying intent). A/B testing the initial message will not solve a conversion problem at the conversation stage.
Instead of running tests reactively ("let's test a new message next month"), plan the full quarter's test sequence at the start. A quarterly plan for a mid-volume LinkedIn operation:
Month 1: Connection note test (role-specific vs problem-specific opening)
Month 2: Follow-up message 1 test (value-first vs question-first)
Month 3: Follow-up sequence length test (2-step vs 3-step) + timing test (2-day vs 5-day gap)
This approach ensures continuous optimization throughout the quarter without gaps between tests, and prevents optimization from stalling because no one planned the next test.
Most B2B outreach teams target multiple ICP segments simultaneously (e.g., VP Sales at SaaS companies AND VP Marketing at agencies). Running the same A/B test across both segments averages the results, which may mask that Variant A is much stronger for VP Sales at SaaS companies while Variant B is stronger for VP Marketing at agencies.
Run separate A/B tests for each ICP segment with at least 100 prospects per variant per segment. Record the winning variant for each segment separately and configure Aimfox campaigns with segment-specific messages rather than a single winner applied to all.
When running A/B tests across multiple months, an external factor (LinkedIn algorithm change, industry news event, seasonal variation in LinkedIn activity) can affect acceptance rates independently of any message change. A control campaign — an always-running campaign using the best-performing message from all previous tests — provides a stable baseline for comparison.
If the control campaign's acceptance rate drops 8 points in January, and your current test variant also drops 8 points, the decline is likely seasonal. If the control holds steady but the new variant declines, the new variant is performing worse than the baseline — retire it.
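That comparison amounts to subtracting the control's change from the variant's change. A minimal sketch follows, with a hypothetical function name and illustrative rates in percentage points.

```python
def control_adjusted_change(variant_before, variant_after, control_before, control_after):
    """Change in the variant's rate (percentage points) beyond what the control saw."""
    variant_change = variant_after - variant_before
    control_change = control_after - control_before
    return variant_change - control_change

# Both dropped 8 points: the adjusted change is 0, which points to a seasonal effect.
print(control_adjusted_change(30, 22, 32, 24))  # 0
# The control held steady while the variant dropped 8 points.
print(control_adjusted_change(30, 22, 32, 32))  # -8
```

An adjusted change near zero suggests an external factor. A clearly negative value suggests the variant itself is underperforming the baseline.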
When testing follow-up message timing in Aimfox (2 days vs 5 days after connection), the results reveal something about your ICP's "recency sensitivity" — whether they respond better when outreach is timely (2 days) or spaced out (5 days). This preference likely extends to cold email sequences as well.
If 2-day spacing outperforms 5-day in Aimfox follow-up testing, consider testing shorter sequence intervals in Instantly email sequences for the same ICP. Cross-channel preference alignment means the email sequence benefits from the LinkedIn A/B data without running separate email timing tests.
Most testing programs treat each test as independent. A better approach maintains a cumulative performance record that shows how each successive test moved the overall campaign baseline:
| Test cycle | Element tested | Variant A result | Variant B result | Winner | Baseline improvement |
|---|---|---|---|---|---|
| 1 | Note length | 18% acceptance | 26% acceptance | B (longer) | +8 points |
| 2 | Opening approach | 26% | 34% | B (problem-first) | +8 points |
| 3 | CTA | 34% | 37% | B (soft question) | +3 points |
| 4 | Follow-up timing | 37% | 40% | B (2-day gap) | +3 points |
This table shows cumulative progress: 18% acceptance at start to 40% acceptance after 4 test cycles. That compounded improvement represents the real ROI of systematic testing.
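A cumulative record like this is easy to keep in code alongside the spreadsheet. The sketch below uses the acceptance figures from the table and assumes Variant A in each cycle is the running baseline carried over from the previous cycle. The `summarize` helper is hypothetical.

```python
# Results from the table, in percent: (element tested, Variant A, Variant B).
# Variant A is assumed to be the running baseline in each cycle.
test_log = [
    ("Note length", 18, 26),
    ("Opening approach", 26, 34),
    ("CTA", 34, 37),
    ("Follow-up timing", 37, 40),
]

def summarize(log):
    """Return per-cycle gains over the baseline and the total gain, in points."""
    gains = []
    for element, baseline, challenger in log:
        # A cycle only moves the baseline if the challenger beats it.
        gains.append((element, max(baseline, challenger) - baseline))
    start = log[0][1]
    end = max(log[-1][1], log[-1][2])
    return gains, end - start

gains, total = summarize(test_log)
for element, gain in gains:
    print(f"{element}: +{gain} points")
print(f"Total: +{total} points")  # Total: +22 points
```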
| Need | Tool | Notes |
|---|---|---|
| Verified B2B contacts | Quarvio | One-time purchase, no subscription |
| Email inboxes | Inframail | Microsoft 365 inboxes, auto DNS |
| Cold email sending | Instantly | Sequences, warm-up, reply tracking |
| LinkedIn outreach | Aimfox | Connection campaigns, Unibox |
How many prospects do I need to run a meaningful A/B test in Aimfox?
At minimum 100 prospects per variant, 200 in total. Below 100 per variant, random variation in individual acceptance behaviour produces misleading results. 200–300 per variant gives clearer data. If your prospect list is smaller than 200 total, run the full list on one variant first and save the A/B test for the next campaign cycle, when you have enough prospects to split.
How long should I run an Aimfox A/B test before evaluating results?
At least 2 weeks, and until each variant has sent at least 100 connection requests, whichever takes longer. Running for less than 2 weeks introduces day-of-week bias (acceptance rates vary across the working week). Do not evaluate results mid-test based on early data; wait for the test to complete.
Can I test multiple elements at the same time in Aimfox?
You can run multiple simultaneous campaigns testing different elements, but you should not change two variables within a single campaign pair. Test the connection note in one A/B setup and the follow-up Step 1 message in a separate A/B setup on a different audience. Mixing multiple variables in one test makes it impossible to attribute the result to a specific change.
What is the most important element to A/B test first in a LinkedIn campaign?
Start with the connection note. It is the first thing the prospect sees, and its performance (acceptance rate) determines the size of the audience available for all subsequent follow-up steps. A 15-percentage-point improvement in acceptance rate on a 500-prospect list creates roughly 75 additional prospects for follow-up, which compounds through the entire sequence.
What is the minimum acceptable acceptance rate for a LinkedIn connection campaign?
For cold LinkedIn outreach to B2B prospects, an acceptance rate below 15% indicates a problem — either with the connection note or the prospect list quality. The average acceptance rate for optimised connection notes to well-targeted audiences is 25–35% per G2 reviews of Aimfox. If your rate is below 15%, run an A/B test comparing a fundamentally different note approach rather than adjusting the existing note.
How do I handle A/B testing when I have multiple LinkedIn accounts in Aimfox?
If Aimfox is managing multiple LinkedIn accounts, run both variants from the same account to prevent account-level performance differences from confounding the test. If you want to test account-level performance (one account vs another), run that as a separate test with identical messages on each account, not in combination with message testing.
Can I use A/B test data from one ICP segment to infer what will work for another?
With caution. If you find that problem-first opening in connection notes significantly outperforms role-specific opening for VP Sales at SaaS companies, this finding may transfer to VP Sales at enterprise software companies (similar role and concerns) but may not transfer to Head of Recruiting at those same companies (different role, different concerns). Use cross-segment inference as a hypothesis for the next test, not as a conclusion that skips the test.
Should I keep testing once I have a strong result (e.g., 35% acceptance rate)?
Yes, but switch from message optimization to structural testing. At 35% acceptance rate, you have likely optimized the message itself to a near-ceiling. The next performance gains come from sequence structure (timing, length), audience segmentation (narrower ICP targeting), and channel coordination (adding email outreach). Continue testing, but broaden the scope of what you test.
How does Aimfox Unibox integrate with A/B testing?
Aimfox Unibox aggregates all conversation replies from all campaigns into a single inbox. During an A/B test, replies from both Variant A and Variant B appear in Unibox. Tag incoming replies by the campaign variant they came from (either manually or using Aimfox labels) so you can compare not just acceptance rate but conversation quality across variants. Variant A may have a higher acceptance rate but produce more low-quality "not interested" replies; Variant B may produce fewer acceptances but more substantive conversations. Unibox data captures this quality dimension that the acceptance rate metric alone misses.
What should I do when a new test variant outperforms the previous champion?
When the new variant wins, update the champion in your tracking document and retire the previous champion from active campaign use. Begin designing the next test using the new champion as the baseline. Resist the temptation to run the previous champion alongside the new one "just to be sure" — this wastes prospect list and delays the next optimization cycle. Trust the test result and move to the next variable.
How do I know if my A/B testing program is generating compounding improvements?
Track your campaign baseline acceptance rate and conversation rate at the start of each month. If the metrics are not improving across testing cycles, review whether the tests are truly testing meaningfully different approaches or just variations within the same pattern. Compounding improvement requires each test cycle to find a genuine winner that outperforms the previous champion — if tests are consistently inconclusive, the variants are not different enough to generate data.
Can I use Aimfox A/B test results to inform LinkedIn Ads creative testing?
Yes. If a specific problem framing or audience persona outperforms others in connection note A/B tests, the same creative direction is worth testing in LinkedIn Sponsored Content. The A/B test data shows you what messaging resonates with your exact ICP on LinkedIn, which translates directly into ad creative strategy. The formats are different (single-person message vs paid ad) but the audience and messaging preferences are shared.