Added support for SMP. Implementation of ARCONNECT. #75
Modules implemented:
- Inter-core Interrupt Unit (ICI)
- Interrupt Distribution Unit (IDU) (not fully tested)
- Global Free-Running Counter (GFRC) (not tested)
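For readers unfamiliar with the ICI, here is a minimal toy model of how one core raises an inter-core interrupt on another and how the receiver acknowledges it. It is a sketch under assumptions, not the code in this PR: the command-word layout (command in bits 7:0, parameter in bits 23:8) and the command numbers follow what the Linux ARC MCIP driver appears to use, and every `toy_`/`TOY_` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_CORES 4

/* Assumed command numbers (illustrative, not taken from this PR). */
enum {
    TOY_CMD_GENERATE_IRQ = 0x01, /* param = destination core */
    TOY_CMD_GENERATE_ACK = 0x02, /* param = core that sent the IRQ */
};

/* Assumed layout of the word written to the command register:
 * bits 7:0 = command, bits 23:8 = parameter. */
static uint32_t mcip_encode(uint8_t cmd, uint16_t param)
{
    return (uint32_t)cmd | ((uint32_t)param << 8);
}

/* pending[dst] holds one bit per source core with an unacknowledged IRQ. */
typedef struct {
    uint32_t pending[TOY_NUM_CORES];
} ToyIci;

/* Execute a command word issued by core `self`. */
static void toy_ici_exec(ToyIci *ici, unsigned self, uint32_t word)
{
    uint8_t cmd = word & 0xff;
    uint16_t param = (word >> 8) & 0xffff;

    assert(self < TOY_NUM_CORES && param < TOY_NUM_CORES);

    if (cmd == TOY_CMD_GENERATE_IRQ) {
        ici->pending[param] |= 1u << self;    /* self interrupts core `param` */
    } else if (cmd == TOY_CMD_GENERATE_ACK) {
        ici->pending[self] &= ~(1u << param); /* self acks IRQ from `param` */
    }
}

/* Bitmask of cores that currently have an IRQ pending on `core`. */
static uint32_t toy_ici_pending(const ToyIci *ici, unsigned core)
{
    assert(core < TOY_NUM_CORES);
    return ici->pending[core];
}
```

In this model, core 0 interrupting core 2 amounts to `toy_ici_exec(&ici, 0, mcip_encode(TOY_CMD_GENERATE_IRQ, 2))`, and core 2 clears it with `toy_ici_exec(&ici, 2, mcip_encode(TOY_CMD_GENERATE_ACK, 0))`.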
Cannot compile it on Ubuntu 20.04:
For the record:
Indeed, quad-core Linux on HS3x/4x does boot and runs hackbench, which is pretty cool in itself!
$ ./build/qemu-system-arc -M virt -serial mon:stdio -display none -kernel vmlinux -cpu archs -smp 4
Linux version 5.17.13 (abrodkin@abrodkin-5550) (arc-buildroot-linux-gnu-gcc.br_real (Buildroot 2022.05-rc2-58-g5821e96bd3) 11.3.0, GNU ld (GNU Binutils) 2.38) #2 SMP PREEMPT Mon Jun 6 15:30:47 +04 2022
Memory @ 80000000 [512M]
OF: fdt: Machine model: snps,zebu_hs-smp
earlycon: uart8250 at MMIO32 0xf0000000 (options '115200n8')
printk: bootconsole [uart8250] enabled
Failed to get possible-cpus from dtb, pretending all 4 cpus exist
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x0] CHIPID [0xffff]
processor [0] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
Zone ranges:
  Normal [mem 0x0000000080000000-0x000000009fffffff]
Movable zone start for each node
Early memory node ranges
  node 0: [mem 0x0000000080000000-0x000000009fffffff]
Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
percpu: Embedded 6 pages/cpu s14848 r8192 d26112 u49152
pcpu-alloc: s14848 r8192 d26112 u49152 alloc=6*8192
pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
Built 1 zonelists, mobility grouping on. Total pages: 65248
Kernel command line: earlycon=uart8250,mmio32,0xf0000000,115200n8 console=ttyS0,115200n8 debug print-fatal-signals=1
Dentry cache hash table entries: 65536 (order: 5, 262144 bytes, linear)
Inode-cache hash table entries: 32768 (order: 4, 131072 bytes, linear)
mem auto-init: stack:off, heap alloc:off, heap free:off
Memory: 513600K/524288K available (3636K kernel code, 595K rwdata, 776K rodata, 1560K init, 241K bss, 10688K reserved, 0K cma-reserved)
rcu: Preemptible hierarchical RCU implementation.
rcu: RCU event tracing is enabled.
Trampoline variant of Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
NR_IRQS: 512
MCIP: IDU supports 4 common irqs
Global-64-bit-Ctr clocksource not detected
Failed to initialize '/gfrc': -6
Console: colour dummy device 80x25
sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every 21474836475000000ns
Calibrating delay loop... 190.87 BogoMIPS (lpj=954368)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 0, 8192 bytes, linear)
Mountpoint-cache hash table entries: 2048 (order: 0, 8192 bytes, linear)
cblist_init_generic: Setting adjustable number of callback queues.
cblist_init_generic: Setting shift to 2 and lim to 1.
rcu: Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
Idle Task [1] (ptrval)
Trying to bring up CPU1 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x1] CHIPID [0xffff]
processor [1] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU1 LIVE ##: Executing Code...
Idle Task [2] (ptrval)
Trying to bring up CPU2 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x2] CHIPID [0xffff]
processor [2] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU2 LIVE ##: Executing Code...
Idle Task [3] (ptrval)
Trying to bring up CPU3 ...
archs-intc : 16 priority levels (default 1) FIRQ (not used)
IDENTITY : ARCVER [0x54] ARCNUM [0x3] CHIPID [0xffff]
processor [3] : HS38 R3.10a (ARCv2 ISA)
Timers : Timer0 Timer1
ISA Extn : mpy[opt 7]
MMU [v4] : 8k PAGE, 2M Super Page (not used) , swalk 2 lvl, JTLB 1024 (256x4), uDTLB 8, uITLB 4, PAE40 (not used)
I-Cache : 64K, 4way/set, 64B Line, VIPT aliasing
D-Cache : 64K, 2way/set, 64B Line, PIPT
Peripherals : 0xc0000000
Vector Table : 0x80000000
Extn [SMP] : ARConnect (v0): 4 cores with IDU
## CPU3 LIVE ##: Executing Code...
smp: Brought up 1 node, 4 CPUs
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 1024 (order: 4, 131072 bytes, linear)
NET: Registered PF_NETLINK/PF_ROUTE protocol family
DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
NET: Registered PF_INET protocol family
IP idents hash table entries: 8192 (order: 3, 65536 bytes, linear)
tcp_listen_portaddr_hash hash table entries: 1024 (order: 0, 12288 bytes, linear)
TCP established hash table entries: 4096 (order: 1, 16384 bytes, linear)
TCP bind hash table entries: 4096 (order: 2, 32768 bytes, linear)
TCP: Hash tables configured (established 4096 bind 4096)
UDP hash table entries: 256 (order: 0, 8192 bytes, linear)
UDP-Lite hash table entries: 256 (order: 0, 8192 bytes, linear)
NET: Registered PF_UNIX/PF_LOCAL protocol family
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
arc-pct fpga:pct: use noncoherent DMA ops
This core does not have performance counters!
workingset: timestamp_bits=30 max_order=16 bucket_order=0
io scheduler mq-deadline registered
io scheduler kyber registered
simple-pm-bus fpga: use noncoherent DMA ops
Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
of_serial f0000000.serial: use noncoherent DMA ops
printk: console [ttyS0] disabled
f0000000.serial: ttyS0 at MMIO 0xf0000000 (irq = 1, base_baud = 3125000) is a 16550A
printk: console [ttyS0] enabled
printk: console [ttyS0] enabled
printk: bootconsole [uart8250] disabled
printk: bootconsole [uart8250] disabled
NET: Registered PF_PACKET protocol family
NET: Registered PF_KEY protocol family
Freeing unused kernel image (initmem) memory: 1560K
This architecture does not have kernel memory protection.
Run /init as init process
  with arguments:
    /init
  with environment:
    HOME=/
    TERM=linux
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Saving random seed: random: dd: uninitialized urandom read (32 bytes read)
OK
Starting network: OK

Welcome to Buildroot
buildroot login: random: crng init done

buildroot login: root
# hackbench
Running in process mode with 10 groups using 40 file descriptors each (== 400 tasks)
Each sender will pass 100 messages of 100 bytes
Time: 5.000
Still, would be good to see:
This was only tested on ARCv2. ARConnect might not even be enabled on ARCv3; QEMU just doesn't properly complain when the -smp option is used there.
@cupertinomiranda AFAIK it should be exactly the same for ARCv3.
And that's what I have in the log:
OK, with that trivial fix:

diff --git a/target/arc/regs-detail.def b/target/arc/regs-detail.def
index d0ab800f30..3ce3bf1a67 100644
--- a/target/arc/regs-detail.def
+++ b/target/arc/regs-detail.def
@@ -406,9 +406,9 @@ DEF(0x545, ARC_OPCODE_ARC700, NONE, aux_cabac_misc2)
 /* ARConnect */
 DEF (0xd0, ARC_OPCODE_ARCALL, NONE, mcip_bcr)
-DEF(0x600, ARC_OPCODE_ARCV2, NONE, mcip_cmd)
-DEF(0x601, ARC_OPCODE_ARCV2, NONE, mcip_wdata)
-DEF(0x602, ARC_OPCODE_ARCV2, NONE, mcip_readback)
+DEF(0x600, ARC_OPCODE_ARCALL, NONE, mcip_cmd)
+DEF(0x601, ARC_OPCODE_ARCALL, NONE, mcip_wdata)
+DEF(0x602, ARC_OPCODE_ARCALL, NONE, mcip_readback)
 DEF(0x700, ARC_OPCODE_ARCALL, NONE, smart_control)
 /*

I have much more success, see:
So basically, it works exactly as on a uniprocessor (UP) HS5x, in that it fails on user-space stuff. Great work, anyway!
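To illustrate why switching the ARConnect registers from `ARC_OPCODE_ARCV2` to `ARC_OPCODE_ARCALL` matters, here is a minimal sketch of a mask-gated auxiliary-register table. It assumes the second `DEF` argument acts as a bitmask of ISA families for which the register exists; the flag values, `AuxRegDef`, and `aux_reg_lookup` are hypothetical names for illustration and are not QEMU's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ISA flags; the real ARC_OPCODE_* values may differ. */
enum {
    ISA_ARC700 = 1 << 0,
    ISA_ARCV2  = 1 << 1,
    ISA_ARCV3  = 1 << 2,
    ISA_ALL    = ISA_ARC700 | ISA_ARCV2 | ISA_ARCV3,
};

typedef struct {
    uint32_t    address;  /* auxiliary register number */
    uint32_t    isa_mask; /* ISA families that implement the register */
    const char *name;
} AuxRegDef;

#define REGS_COUNT 4

/* ARConnect registers as declared before the fix: ARCv2 only. */
static const AuxRegDef regs_before[REGS_COUNT] = {
    { 0x0d0, ISA_ALL,   "mcip_bcr" },
    { 0x600, ISA_ARCV2, "mcip_cmd" },
    { 0x601, ISA_ARCV2, "mcip_wdata" },
    { 0x602, ISA_ARCV2, "mcip_readback" },
};

/* The same registers after the fix: visible to every ISA family. */
static const AuxRegDef regs_after[REGS_COUNT] = {
    { 0x0d0, ISA_ALL, "mcip_bcr" },
    { 0x600, ISA_ALL, "mcip_cmd" },
    { 0x601, ISA_ALL, "mcip_wdata" },
    { 0x602, ISA_ALL, "mcip_readback" },
};

/* Return the entry for `address` if the CPU's ISA family is in its mask,
 * or NULL when the register does not exist for that CPU. */
static const AuxRegDef *aux_reg_lookup(const AuxRegDef *table, size_t count,
                                       uint32_t address, uint32_t cpu_isa)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (table[i].address == address && (table[i].isa_mask & cpu_isa)) {
            return &table[i];
        }
    }
    return NULL;
}
```

Under this model, an ARCv3 core sees `mcip_bcr` in both tables but finds `mcip_cmd` (0x600) only in `regs_after`, so it could detect ARConnect yet be unable to issue commands to it before the fix.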
@cupertinomiranda FWIW rebased on top of today's