ref: 97356acb50e212fcfb7c91715718ec70953f780c
parent: 478c70f6d2c1f83d4c4e82ced533e71d9e19ef32
author: Joel Fernandes <joelaf@google.com>
date: Sun Sep 13 18:05:47 EDT 2020
vp8: Remove sched_yield on POSIX systems

libvpx calls sched_yield() on Linux. This is highly frowned upon these days, mainly because it is not needed and causes high scheduler overhead.

It is not needed because the kernel will preempt the task while it is spinning, which implies a yield. On ChromeOS, not yielding gives the following improvements:

1. power_VideoCall test as seen in the perf profile:

With yield:
  9.40% [kernel] [k] __pi___clean_dcache_area_poc
  7.32% [kernel] [k] _raw_spin_unlock_irq <-- kernel scheduler

Without yield:
  8.76% [kernel] [k] __pi___clean_dcache_area_poc
  2.27% [kernel] [k] _raw_spin_unlock_irq <-- kernel scheduler

As you can see, the scheduler's CPU utilization drops by about 5 percentage points (7.32% to 2.27%).

2. power_VideoCall test results:

There is a 3% improvement in max video FPS, from 30 to 31. This improvement is consistent.

Also note that the sched_yield() man page itself says it is intended only for RT tasks. From the man page:
"sched_yield() is intended for use with real-time scheduling policies (i.e., SCHED_FIFO or SCHED_RR) and very likely means your application design is broken."

BUG=b/168205004
Change-Id: Idb84ab19e94f6d0c7f9e544e7a407c946d5ced5c
Signed-off-by: Joel Fernandes <joelaf@google.com>
--- a/vp8/common/threading.h
+++ b/vp8/common/threading.h
@@ -171,17 +171,20 @@
#define sem_wait(sem) (semaphore_wait(*sem))
#define sem_post(sem) semaphore_signal(*sem)
#define sem_destroy(sem) semaphore_destroy(mach_task_self(), *sem)
-#define thread_sleep(nms)
-/* { struct timespec ts;ts.tv_sec=0; ts.tv_nsec =
- 1000*nms;nanosleep(&ts, NULL);} */
#else
#include <unistd.h>
#include <sched.h>
-#define thread_sleep(nms) sched_yield();
+#endif /* __APPLE__ */
+/* Not Windows. Assume pthreads */
+
+/* thread_sleep implementation: yield unless Linux/Unix. */
+#if defined(__unix__) || defined(__APPLE__)
+#define thread_sleep(nms)
/* {struct timespec ts;ts.tv_sec=0;
ts.tv_nsec = 1000*nms;nanosleep(&ts, NULL);} */
-#endif
-/* Not Windows. Assume pthreads */
+#else
+#define thread_sleep(nms) sched_yield();
+#endif /* __unix__ || __APPLE__ */
#endif
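
Illustration (not part of the patch): a minimal sketch of how a spin-wait loop behaves with the macro selection above. The helper spin_wait_for() and its max_spins bound are hypothetical and do not come from libvpx; only the thread_sleep definitions mirror the diff. It assumes a C11 compiler with <stdatomic.h>.

```c
#include <stdatomic.h>

/* Same selection as the patched header (non-Windows branch assumed):
 * a no-op on Unix-like systems and macOS, sched_yield() elsewhere. */
#if defined(__unix__) || defined(__APPLE__)
#define thread_sleep(nms)
#else
#include <sched.h>
#define thread_sleep(nms) sched_yield();
#endif

/* Hypothetical helper, not a libvpx function: spin until *progress
 * reaches target, or give up after max_spins iterations.
 * Returns 1 if the target was reached, 0 on timeout. */
int spin_wait_for(atomic_int *progress, int target, long max_spins) {
  long spins;
  for (spins = 0; spins < max_spins; ++spins) {
    if (atomic_load_explicit(progress, memory_order_acquire) >= target) {
      return 1;
    }
    /* Expands to an empty statement on Linux/macOS after this change,
     * so the loop spins and relies on kernel preemption. */
    thread_sleep(0);
  }
  return atomic_load_explicit(progress, memory_order_acquire) >= target;
}
```

In a real caller, another thread would advance the counter with a release store while the waiter spins. The iteration bound is an assumption added here so the sketch always terminates.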