
Reusable and Scalable Verification Solutions for Designing AI/ML SoCs


Arm TechCon 2019

Satya Acharya
October 2019

AI SoC Verification Challenges

• Fast and scalable design verification environment to manage multiple levels of verification
• Early, automated and cycle-accurate functional & performance validation of SoC/Interconnects
• Dedicated testbenches are effort-intensive and error-prone
• Simulator engines for faster simulation
• Smart & effective debugger

[Figure: Scalable, High Performance SoC using Arm® CoreLink™ CMN-600 and Arm® Clusters]

© 2019 Synopsys, Inc. 3


SoC Verification Automation & Environment
Fully Automated DUT, Testbench, Functional & Performance Design & Validation

• Accelerated design with Synopsys DW Cores & PHYs
• Automated SoC Assembly with coreAssembler
• Accelerated SoC verification with Synopsys VIP & XTORs
• Automated Testbench Assembly with VC AutoTestbench
• Protocol Compliance Verification with Synopsys Test Suites
• Automated Performance Stimulus with VC VIP AutoPerformance

[Figure: Synopsys IIP and 3rd/User IP feed coreAssembler to build the SoC; Synopsys VIP & XTORS, VC AutoTestbench, Synopsys Test Suites, Test Profile and VC VIP AutoPerformance build the SoC Testbench]


SoC Verification Automation & Environment
Fully Automated DUT, Testbench, Functional & Performance Design & Validation

• S/W to Silicon Verification with Virtualizer, VCS, ZeBu & HAPS
• Comprehensive Debug, Planning, Coverage, Power, & Performance Analysis with Verdi
• SoC Regression Automation with VC Execution Manager

[Figure: the SoC (Synopsys IIP, 3rd/User IP, coreAssembler) and SoC Testbench (Synopsys VIP & XTORS, VC AutoTestbench, Synopsys Test Suites, Test Profile, VC VIP AutoPerformance) run under VC Execution Manager on Virtualizer, VCS, ZeBu and HAPS; Verdi provides Debug, Planning & Coverage and Performance Analysis toward Verification Closure]


Automated Testbench Generation & Performance Analysis
VC AutoTestbench & VC VIP AutoPerformance



Pure SystemVerilog for Maximum Visibility, Control & Performance
Arm® AMBA® Solution: CHI, ACE, AXI4, AXI3, AHB, APB
• Protocol Support
– AXI3, AXI4/5, ACE4/5 and CHI (A/B/C/D*/E*)
– AHB, APB4, ATB & Stream

• Includes
– Master, Slave
– Interconnect
– Sequence Collection

• Features
– Verification plan
– Built-in coverage
– Test Suite (CHI, AMBA 3 & AMBA 4)
– Support for Synopsys Protocol Analyzer

[Figure: Integrated with Protocol Analyzer]

Automated Testbench Generation from IP to SoC
VC AutoTestbench

• Test environment generation
– DUT IP-XACT provided by SNPS coreAssembler, Arm® Socrates System Builder™ and Arm® AMBA® Designer
OR
– KDB generated by Verdi

• Automated VIP selection based on VIP IP-XACT
– Both Synopsys VIP & user VIP

[Figure: DUT IP-XACT or Verdi KDB, together with VIP IP-XACT, feed VC AutoTestbench]


Automated Testbench Generation from IP to SoC
VC AutoTestbench

• Supports SV/UVM for
– Interconnect verification
– IP verification
– SoC verification

• Both passive & active RTL replacement
– Both interconnect & IP verification from SoC

• Integrated into Verdi

[Figure: from DUT IP-XACT or Verdi KDB plus VIP IP-XACT, VC AutoTestbench generates an Interconnect Testbench, an IP Testbench or a SoC Testbench, each built from a Virtual Sequencer, VIPs and a System Monitor or Scoreboard]


VC AutoTestbench
Easy 5 step flow

Step 1: Read in DUT IP-XACT OR Create Bus through KDB
Step 2: Connect VIP
Step 3: Connect Clk, Rst & AdHoc Signals
Step 4: Configure VIP
Step 5: One click to generate Testbench!


VC AutoTestbench
Complete SV/UVM Testbench Automation
• SV/UVM environment with
– VIP instantiated, configured & connected
– Virtual sequencer
– Configuration files
– Sanity test

• Verilog DUT wrapper with
– DUT wrapper
– Clock & Reset generators

• Makefiles & VCS simulator setup

• HTML documentation

• User extendable



Complete Interconnect Functional Verification
VC AutoTestbench + VC VIP for AMBA + AMBA VIP Test Suites

• AMBA VIP
– CHI, ACE, AXI, AHB, APB support
– System Monitor with coherent & non coherent checks

• AMBA Test Suite
– Connectivity, Functional, Power Management, Dynamic Reset, & Performance Verification Tests

[Figure: Interconnect Testbench generated by VC AutoTestbench from DUT IP-XACT and VIP IP-XACT: ACE master VIPs and ACE/AXI slave VIPs surround the Interconnect Subsystem (Cache Coherent Interconnect, AXI Interconnect), with the AMBA System Monitor and AMBA Test Suite]


Complete AMBA Interconnect Functional Verification
No extra interconnect environment needed

• Use SoC environment for interconnect verification
– No need for a 2nd environment

• VC AutoTestbench active replacement mode
– Non-interconnect IP replaced by VIP

• Passive-only monitoring also supported

[Figure: SoC Interconnect Testbench generated by VC AutoTestbench from DUT IP-XACT or Verdi KDB plus VIP IP-XACT: Virtual Sequencer, AMBA System Monitor and AMBA Test Suite; ACE master VIPs stand in for the Arm CPUs and GPU, and ACE/AXI slave VIPs for MCTL, CSI2, PCIe and USB, around the Cache Coherent Interconnect and AXI Interconnect]


AMBA System Monitor
Performs System Level checks across interconnect ports

[Figure: System Env with CHI Agents and ACE-Lite Masters attached over AMBA 5 CHI and ACE-Lite ports to a CoreLink CCN-xxx cache-coherent network with L3 Cache; a CHI Agent and an AXI4 Slave sit on the downstream ports; the System Monitor observes all ports]


Quickly Identify the Source of Bugs
Verdi Protocol Analyzer and Memory Protocol Analyzer

• Protocol-aware transactions & SmartLog attributes
• Synchronize time with waveform viewer, SmartLog, etc.
• Link transactions to signals (concurrent)
• Find bugs quicker with
• Gain insight into memory operations
• Flexible interactive or post-processing mode

[Figure: Protocol Analyzer panes: Waveform Viewer, Transactions, Selection Attributes, Testbench Hierarchy, Selected Address]


Smart Debug using Protocol Analyzer
Coherent to Snoop Association error

Identifying overlapping transactions



Simplified Cache Debug
Synchronized transaction & cache views

• Visualize each cache operation
– Coherent/Snoop history
– Cache/Memory

• Detailed view of each operation

• Complete cache operation history for selected address

• Synchronization with Protocol View

[Figure: Multiple Masters (Master 1, Master 2, Master 3, each with a Cache) connect through the Coherent Interconnect to the Slave (Main Memory); view callouts: Attributes, Cacheline State, Selected Transaction, Memory instances, Coherent Transactions, Access]

Automated Performance Verification
VC VIP AutoPerformance



Complete Interconnect Performance Verification
VC AutoTestbench + VC VIP AutoPerformance + VC VIP for AMBA

• User provides test profile
– Defines test grouping, sequencers, traffic profiles & resources
– Based on Arm Adaptive Traffic Profile specification

• VC VIP AutoPerformance generates AMBA traffic to match profile

• AMBA VIP built-in traffic rate adapter & arbiter drive profile on DUT

• User analyzes performance simulation results with Verdi Performance Analyzer

[Figure: DUT IP-XACT or Verdi KDB plus VIP IP-XACT feed VC AutoTestbench; the Test Profile feeds VC VIP AutoPerformance; the simulation runs on VCS and the results go to Verdi Performance Analyzer]


Defining a Performance Test
VC VIP AutoPerformance

• Each Group is executed sequentially
• The Traffic Profiles in a given Group are executed concurrently
• A Synchronization Specification coordinates the execution of Traffic Profiles within a Group
• Multiple Resource Profiles can be associated with a sequencer

[Figure: a Test Profile contains Groups; each Group contains Sequencers, each with one or more Traffic Profiles (for example a CHI/ACE Traffic Profile), plus a Synchronization Specification; Resource Profiles attach to Sequencers]

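
The execution order described on this slide (Groups one after another, Traffic Profiles inside a Group at the same time) can be illustrated with a small Python sketch. It only models the scheduling order and is not VC VIP AutoPerformance code; `run_profile`, `run_group`, `run_test_profile` and the profile names are hypothetical.

```python
import asyncio

async def run_profile(name, duration, log):
    """Hypothetical stand-in for one traffic profile driving a sequencer."""
    log.append(("start", name))
    await asyncio.sleep(duration)  # pretend to generate traffic
    log.append(("end", name))

async def run_group(profiles, log):
    """All traffic profiles of one group run concurrently."""
    await asyncio.gather(*(run_profile(n, d, log) for n, d in profiles))

async def run_test_profile(groups):
    """Groups run sequentially: a group starts only after the previous one ends."""
    log = []
    for profiles in groups:
        await run_group(profiles, log)
    return log

groups = [
    [("write_m0", 0.02), ("write_m1", 0.01)],  # group "memory writes"
    [("read_m0", 0.01)],                        # group "memory reads"
]
log = asyncio.run(run_test_profile(groups))
```

In the resulting log both write profiles start before either finishes, and the read profile starts only after both writes have ended.
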
Analyze & Debug Performance Violations
Verdi Performance Analyzer for latency, bandwidth & throughput violations

• Pre-defined metrics
– Latencies, bandwidths, counts, etc.

• Supports user-defined metrics
– Based on SQL query statements

• Apply constraints to detect violations

• GUI charts to visualize distributions & violations

• Trace violations to the related transactions

• Export performance reports and charts

• Batch mode support to regress multiple tests
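
To make the idea of SQL-based user-defined metrics and constraints concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `xact` table, its columns, the sample rows and the 500 ns limit are all assumptions made for illustration; the actual database schema used by Verdi Performance Analyzer is not shown in these slides.

```python
import sqlite3

# Hypothetical transaction table; a real tool defines its own schema.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE xact (port TEXT, kind TEXT, start_ns INTEGER, end_ns INTEGER)"
)
con.executemany(
    "INSERT INTO xact VALUES (?, ?, ?, ?)",
    [
        ("slave_0", "WRITE", 0, 120),
        ("slave_0", "WRITE", 50, 900),
        ("slave_0", "READ", 100, 180),
        ("master_0", "WRITE", 0, 100),
    ],
)

MAX_WRITE_LATENCY_NS = 500  # assumed constraint value

# User-defined metric: write count, max and average latency per port.
metric = con.execute(
    """
    SELECT port, COUNT(*), MAX(end_ns - start_ns), AVG(end_ns - start_ns)
    FROM xact WHERE kind = 'WRITE'
    GROUP BY port ORDER BY port
    """
).fetchall()

# Constraint: list the individual transactions that exceed the limit,
# so each violation can be traced back to its transaction.
violations = con.execute(
    """
    SELECT port, start_ns, end_ns - start_ns AS latency
    FROM xact
    WHERE kind = 'WRITE' AND end_ns - start_ns > ?
    ORDER BY start_ns
    """,
    (MAX_WRITE_LATENCY_NS,),
).fetchall()
```
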


Example
Am I meeting my interconnect & memory controller
performance goals?



DUT & Testbench
Interconnect & Memory Subsystem

• DUT
– CMN-600
– Memory controller

• VIP
– AMBA System Environment
– Masters, Slaves & System Monitor
– LPDDR VIP

• Traffic profiles initiating traffic on CCI
– Concurrent WRITES from three masters, followed by READS
– All traffic is sent to a single slave

• Performance analyzed during simulation

• Analyzed bottlenecks in DUT

[Figure: Testbench with Virtual Sequencer, System Monitor, Traffic Profile and VC VIP AutoPerformance; three AXI VIPs drive the CMN-600, which connects to the Memory Controller Subsystem and the LPDDR VIP; simulation on VCS with Verdi Performance Analyzer]


Traffic Profile Performance Test

<?xml version="1.0" encoding="utf-8"?>
<traffic_profile
    xmlns="http://www.synopsys.com/vc_speedtest_axi_traffic_profile">
  <master>
    <axi
      total_num_bytes="4096"
      xact_size = "64"
      xact_action = "LOAD"
      xact_gen_type = "FIXED"
      xact_type = "READNOSNOOP"
      cache_gen_type = "RANDOM"
      cache_type_min = "4'b0000"
      cache_type_max = "4'b0001"
      prot_gen_type = "FIXED"
      prot_type_fixed = "SECURE"
      addr_gen_type = "SEQUENTIAL"
      base_addr = "'h4000_0000"
      addr_xrange = "'h4002_0000"
      addr_twodim_stride = "'h400C_0000"
      id_gen_type = "FIXED"
      …
    />
  </master>
</traffic_profile>

Callouts:
• The cache_type range indicates device type transactions
• Initiates device memory WRITE transactions to sequential addresses from 32'h4000_0000
• addr_twodim_stride is not applicable since addr_gen_type is sequential; the user could also opt not to specify it at all
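
A quick way to sanity-check a traffic profile like the one above is to read its attributes and work out how much traffic it describes. The Python sketch below does this for a cut-down profile. It assumes that `xact_size` is the number of bytes per transaction and that sequential addresses advance by `xact_size`; neither is stated on the slide. The helper names `parse_hex` and `summarize` are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"tp": "http://www.synopsys.com/vc_speedtest_axi_traffic_profile"}

# Cut-down profile using only attributes that appear on the slide.
PROFILE = """<traffic_profile
    xmlns="http://www.synopsys.com/vc_speedtest_axi_traffic_profile">
  <master>
    <axi total_num_bytes="4096" xact_size="64"
         addr_gen_type="SEQUENTIAL" base_addr="'h4000_0000"/>
  </master>
</traffic_profile>"""

def parse_hex(literal):
    """Parse a Verilog-style hex literal such as 'h4000_0000 or 32'h4000_0000."""
    _, _, digits = literal.partition("'h")
    return int(digits.replace("_", ""), 16)

def summarize(xml_text):
    """Return transaction count and address span (under the stated assumptions)."""
    axi = ET.fromstring(xml_text).find("tp:master/tp:axi", NS)
    total = int(axi.get("total_num_bytes"))
    size = int(axi.get("xact_size"))      # assumed: bytes per transaction
    base = parse_hex(axi.get("base_addr"))
    count = total // size
    return {
        "transactions": count,
        "first_addr": base,
        "last_addr": base + (count - 1) * size,  # assumed: stride == xact_size
    }

summary = summarize(PROFILE)
```
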


Callouts: this group defines the memory writes; the three sequencer entries are the sequencers for writes; the synchronization specification follows them.

<group name="memory writes">
  <sequencer instance="uvm_test_top.env.amba_system_env_0.axi_system[0].master[0].sequencer"
             name = "master_0_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_1_profile.xml"
                     name="mem_normal_writes_sequential_1_profile"/>
  </sequencer>
  <sequencer instance="uvm_test_top.env.amba_system_env_0.axi_system[0].master[1].sequencer"
             name = "master_1_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_1_profile.xml"
                     name="mem_normal_writes_sequential_1_profile"/>
  </sequencer>
  <sequencer instance="uvm_test_top.env.amba_system_env_0.axi_system[0].master[2].sequencer"
             name = "master_2_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_2_profile.xml"
                     name="mem_normal_writes_sequential_2_profile"/>
  </sequencer>
  <!-- Synchronization specification -->
  <synchronization_specification>
    <output_event name="processor_1_end_of_profile"
                  output_event_type="end_of_profile"
                  sequencer_name="master_2_sequencer"
                  traffic_profile_name="mem_normal_writes_sequential_2_profile" />

    <output_event name="processor_1_end_of_frame_size"
                  output_event_type="end_of_frame_size"
                  sequencer_name="master_2_sequencer"
                  traffic_profile_name="mem_normal_writes_sequential_2_profile"
                  frame_size="4096" />
  </synchronization_specification>
</group>
<!-- Group: memory reads -->

</group>
</test_profile>
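
Because a synchronization specification refers to sequencers and traffic profiles by name, a typo in either name is an easy mistake to make. The sketch below is a small stand-alone Python check that every `output_event` points at a sequencer/traffic-profile pair defined in the same group. It is an illustrative lint, not part of any Synopsys tool; the shortened instance paths, file names and the deliberately wrong `bad_event` entry are made up for the example.

```python
import xml.etree.ElementTree as ET

TEST_PROFILE = """<test_profile>
  <group name="memory writes">
    <sequencer instance="env.master[0].sequencer" name="master_0_sequencer">
      <traffic_profile path="writes_1.xml" name="writes_1"/>
    </sequencer>
    <sequencer instance="env.master[2].sequencer" name="master_2_sequencer">
      <traffic_profile path="writes_2.xml" name="writes_2"/>
    </sequencer>
    <synchronization_specification>
      <output_event name="processor_1_end_of_profile"
                    output_event_type="end_of_profile"
                    sequencer_name="master_2_sequencer"
                    traffic_profile_name="writes_2"/>
      <output_event name="bad_event"
                    output_event_type="end_of_profile"
                    sequencer_name="master_9_sequencer"
                    traffic_profile_name="writes_2"/>
    </synchronization_specification>
  </group>
</test_profile>"""

def check_sync_references(xml_text):
    """Return (group, event) pairs whose sequencer/profile reference is undefined."""
    problems = []
    for group in ET.fromstring(xml_text).iter("group"):
        # Every (sequencer name, traffic profile name) pair defined in this group.
        defined = {
            (seq.get("name"), tp.get("name"))
            for seq in group.iter("sequencer")
            for tp in seq.iter("traffic_profile")
        }
        for event in group.iter("output_event"):
            ref = (event.get("sequencer_name"), event.get("traffic_profile_name"))
            if ref not in defined:
                problems.append((group.get("name"), event.get("name")))
    return problems

problems = check_sync_references(TEST_PROFILE)
```
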


Test Profile CHI & AXI Control

Callouts: this group defines the memory writes; the sequencer entries (two CHI RN sequencers and one AXI master sequencer) are the sequencers for writes; the synchronization specification follows them.

<group name="memory writes">
  <sequencer instance="uvm_test_top.env.amba_system_env_0.chi_system[0].rn[0].rn_xact_seqr"
             name = "rn_0_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_1_profile.xml"
                     name="mem_normal_writes_sequential_1_profile"/>
  </sequencer>
  <sequencer instance="uvm_test_top.env.amba_system_env_0.chi_system[0].rn[1].rn_xact_seqr"
             name = "rn_1_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_2_profile.xml"
                     name="mem_normal_writes_sequential_2_profile"/>
  </sequencer>
  <sequencer instance="uvm_test_top.env.amba_system_env_0.axi_system[0].master[2].sequencer"
             name = "master_2_sequencer">
    <traffic_profile path="mem_normal_writes_sequential_2_profile.xml"
                     name="mem_normal_writes_sequential_2_profile"/>
  </sequencer>
  <!-- Synchronization specification -->
  <synchronization_specification>
    <output_event name="processor_1_end_of_profile"
                  output_event_type="end_of_profile"
                  sequencer_name="rn_0_sequencer"
                  traffic_profile_name="mem_normal_writes_sequential_1_profile" />
  </synchronization_specification>
  <!-- Group: memory reads -->

</group>
</test_profile>


UVM Test

class axi_concurrent_multi_port_vcap extends amba_base_test;
  ...
  virtual function void start_of_simulation_phase(uvm_phase phase);
    string method = "start_of_simulation_phase";
    string traffic_profile_dir = "";
    super.start_of_simulation_phase(phase);
`ifdef SVT_AMBA_VCAP_ENABLE
    // Path to our traffic profiles
    traffic_profile_dir = `SVT_DATA_UTIL_ARG_TO_STRING(`TRAFFIC_PROFILE_PATH);
    `svt_debug("start_of_simulation", {"profile dir: ", traffic_profile_dir});
    // Connect up VC AutoPerformance
    vcap_dpi_wrapper = `SVT_AMBA_TS_SYSTEM_ENV_PATH.traffic_arbiter.get_vcap_dpi_wrapper();
    // The test profile we want to run
    if (!vcap_dpi_wrapper.analyze_test({traffic_profile_dir, "/top_concurrent_multi_port_profile.xml"})) begin
      `svt_fatal("start_of_simulation_phase", "Unable to analyze test");
    end
`endif
  endfunction
  ...
endclass


Bandwidth Observed at a Single Master and Slave

Slave Write bandwidth is lower than expected

[Charts: bandwidth at env.amba_system_env_0.axi_system_0.master_0 and at env.amba_system_env_0.axi_system_0.slave_0]
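
For readers who want to reproduce this kind of comparison outside the GUI, the sketch below shows one simple way to turn a list of completed transactions into a bandwidth-over-time series: sum the bytes completed in each fixed window and divide by the window length. The function name, the window size and the sample data are assumptions for illustration; this is not how Verdi Performance Analyzer is documented to compute its charts.

```python
from collections import defaultdict

def bandwidth_per_window(xacts, window):
    """Bytes per time unit for each window.

    xacts: iterable of (completion_time, num_bytes) pairs.
    Returns {window_start_time: bandwidth}.
    """
    totals = defaultdict(int)
    for end_time, num_bytes in xacts:
        totals[end_time // window] += num_bytes
    return {idx * window: total / window for idx, total in sorted(totals.items())}

# Made-up data: the master issues a burst of writes quickly, while the
# slave completes them at a steady but slower rate.
master_writes = [(10, 64), (20, 64), (30, 64), (110, 64)]
slave_writes = [(40, 64), (150, 64), (260, 64), (370, 64)]

master_bw = bandwidth_per_window(master_writes, window=100)
slave_bw = bandwidth_per_window(slave_writes, window=100)
```

Comparing the two series window by window shows where the slave side falls behind the rate offered by the master.
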


Write Latency Observed at a Slave
This port of the Interconnect is connected to the Memory Controller

Violation: Slow Slave!

Max Latency as per architecture: 40000


Why is it Slow?
Memory subsystem performance bottleneck

Reasons, and the VIP metrics that identify each bottleneck:

Low Bandwidth/Data Transfer Rate on the JEDEC interface: performance related metrics
• Data Transfer Rate
• Command Bandwidth

Irregular Page Accesses: page policy related metrics
• Number of page empty counts
• Number of page Hits/Miss counts

Unoptimized spacing between the Commands: command to command related timing metrics
• Write/Read to all commands
• Refresh/Self Refresh timings

Traffic Profile: command count related metrics
• Activate/Refresh/Refresh All command count
• Precharge/Precharge All command count
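
The page policy metrics listed above (page hit, miss and empty counts) can be illustrated with a short Python sketch. It assumes a simple open-page policy in which a row stays open in its bank until a different row in that bank is accessed, and it ignores refresh and explicit precharge commands, which in a real memory also close pages. The function name and the access list are hypothetical.

```python
def classify_page_accesses(accesses):
    """Count page hits, misses and empties for a sequence of (bank, row) accesses.

    empty: the bank has no open row yet
    hit:   the requested row is already open in the bank
    miss:  a different row is open and must be closed first
    """
    open_rows = {}
    counts = {"hit": 0, "miss": 0, "empty": 0}
    for bank, row in accesses:
        if bank not in open_rows:
            counts["empty"] += 1
        elif open_rows[bank] == row:
            counts["hit"] += 1
        else:
            counts["miss"] += 1
        open_rows[bank] = row  # the accessed row is now the open row
    return counts

accesses = [(0, 1), (0, 1), (0, 2), (1, 5), (0, 2), (1, 6)]
counts = classify_page_accesses(accesses)
```

A high miss count relative to hits is the signature of the "Irregular Page Accesses" reason in the list above.
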


LPDDR Performance Analysis
Unoptimized Commands Gap Spacing

• Write to Write command gap timing metric

• Constraint being set to identify the occurrences where tccd value >10

• The neon purple bars highlight the occurrences where tccd constraint is being violated

…This is our culprit

[Figure callouts: Write command not maintaining constraints; Constraint setting for tccd values]

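
The write-to-write gap check on this slide boils down to measuring the spacing between consecutive write commands and flagging any gap above the limit. A minimal Python sketch follows; it assumes the gaps are measured in clock cycles and uses made-up command timestamps, and the function name is hypothetical.

```python
def write_gap_violations(write_cycles, max_gap=10):
    """Return (previous, current, gap) for consecutive writes spaced more than max_gap."""
    ordered = sorted(write_cycles)
    return [
        (prev, cur, cur - prev)
        for prev, cur in zip(ordered, ordered[1:])
        if cur - prev > max_gap
    ]

# Cycle numbers at which write commands were observed (illustrative data).
writes = [100, 108, 116, 140, 148, 171]
violations = write_gap_violations(writes, max_gap=10)
```

Each returned tuple identifies the pair of write commands involved, which is the information needed to trace a violation back to the traffic that caused it.
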
Conclusion



Synopsys SoC Verification Automation Solutions

Reduce time-to-first-test from weeks to hours with VC AutoTestbench

Automated Performance Verification with VC VIP AutoPerformance & Verdi

Effective and faster simulation debug using Verdi Protocol Analyzer

[Figure: the SoC (Synopsys IIP, 3rd/User IP, coreAssembler) and SoC Testbench (Synopsys VIP & XTORS, VC AutoTestbench, Synopsys Test Suites, Test Profile, VC VIP AutoPerformance) run on VCS; Verdi provides Debug, Planning & Coverage and Performance Analysis]

Synopsys VIP Resources

Quarterly Newsletter: Get the latest information on Verification IP and source code Test Suites, including trending topics straight into your inbox every quarter.

VIP Central Blog: Your go-to resource for bus, protocol, and memory verification IP topics, from automotive to storage to AMBA.

Industry Leading VIP: Our broad portfolio provides access to the industry's latest protocols and interfaces to verify SoC designs.

Visit synopsys.com/vip today!


Thank You
