Diffstat (limited to 'SOURCES/AMD_CPPC.patch')
-rw-r--r-- SOURCES/AMD_CPPC.patch 2923
1 files changed, 1144 insertions, 1779 deletions
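
Note on the msr-index.h hunk in this diff: it renames the CAP1_* and REQ_*
helper macros to AMD_CPPC_*. The old names suggest that the first group
unpacks four 8-bit levels from a capability word and the second group packs
four 8-bit fields into a request word. The user-space C sketch below copies
the eight macro definitions from that hunk and wraps them in two small
helpers. The struct and the cppc_sketch_* function names are hypothetical
and exist only for this illustration; nothing here reads or writes an MSR,
and which register each word belongs to is an assumption based on the old
macro names.

```c
#include <stdint.h>

/* Macro definitions copied from the msr-index.h hunk in this diff. */
#define AMD_CPPC_LOWEST_PERF(x)        (((x) >> 0) & 0xff)
#define AMD_CPPC_LOWNONLIN_PERF(x)     (((x) >> 8) & 0xff)
#define AMD_CPPC_NOMINAL_PERF(x)       (((x) >> 16) & 0xff)
#define AMD_CPPC_HIGHEST_PERF(x)       (((x) >> 24) & 0xff)

#define AMD_CPPC_MAX_PERF(x)           (((x) & 0xff) << 0)
#define AMD_CPPC_MIN_PERF(x)           (((x) & 0xff) << 8)
#define AMD_CPPC_DES_PERF(x)           (((x) & 0xff) << 16)
#define AMD_CPPC_ENERGY_PERF_PREF(x)   (((x) & 0xff) << 24)

/* Hypothetical container for the four capability levels (not a kernel type). */
struct cppc_sketch_caps {
    uint8_t lowest;
    uint8_t lowest_nonlinear;
    uint8_t nominal;
    uint8_t highest;
};

/* Hypothetical helper: split a capability word into its four 8-bit levels. */
struct cppc_sketch_caps cppc_sketch_decode(uint64_t cap)
{
    struct cppc_sketch_caps c;

    c.lowest = (uint8_t)AMD_CPPC_LOWEST_PERF(cap);
    c.lowest_nonlinear = (uint8_t)AMD_CPPC_LOWNONLIN_PERF(cap);
    c.nominal = (uint8_t)AMD_CPPC_NOMINAL_PERF(cap);
    c.highest = (uint8_t)AMD_CPPC_HIGHEST_PERF(cap);
    return c;
}

/* Hypothetical helper: pack min/desired/max/EPP into one request word. */
uint64_t cppc_sketch_pack_req(uint64_t min, uint64_t des, uint64_t max,
                              uint64_t epp)
{
    return AMD_CPPC_MAX_PERF(max) | AMD_CPPC_MIN_PERF(min) |
           AMD_CPPC_DES_PERF(des) | AMD_CPPC_ENERGY_PERF_PREF(epp);
}
```

Using the levels from the cpupower sample in the documentation hunk (166,
117, 39, 15), cppc_sketch_decode(0xa675270f) yields those four values, and
cppc_sketch_pack_req(85, 85, 166, 0) yields 0x005555a6, which corresponds to
the amd_min_perf/amd_des_perf/amd_max_perf values in the trace sample.
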
diff --git a/SOURCES/AMD_CPPC.patch b/SOURCES/AMD_CPPC.patch
index fe4c48f..8a4b71e 100644
--- a/SOURCES/AMD_CPPC.patch
+++ b/SOURCES/AMD_CPPC.patch
@@ -1,18 +1,435 @@
-Add Collaborative Processor Performance Control feature flag for AMD
-processors.
-
-This feature flag will be used on the following amd-pstate driver. The
-amd-pstate driver has two approaches to implement the frequency control
-behavior. That depends on the CPU hardware implementation. One is "Full
-MSR Support" and another is "Shared Memory Support". The feature flag
-indicates the current processors with "Full MSR Support".
-
-Acked-by: Borislav Petkov <bp@suse.de>
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- arch/x86/include/asm/cpufeatures.h | 1 +
- 1 file changed, 1 insertion(+)
-
+diff --git a/Documentation/admin-guide/acpi/cppc_sysfs.rst b/Documentation/admin-guide/acpi/cppc_sysfs.rst
+index fccf22114e85..e53d76365aa7 100644
+--- a/Documentation/admin-guide/acpi/cppc_sysfs.rst
++++ b/Documentation/admin-guide/acpi/cppc_sysfs.rst
+@@ -4,6 +4,8 @@
+ Collaborative Processor Performance Control (CPPC)
+ ==================================================
+
++.. _cppc_sysfs:
++
+ CPPC
+ ====
+
+diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
+new file mode 100644
+index 000000000000..6bafb9354ba0
+--- /dev/null
++++ b/Documentation/admin-guide/pm/amd-pstate.rst
+@@ -0,0 +1,383 @@
++.. SPDX-License-Identifier: GPL-2.0
++.. include:: <isonum.txt>
++
++===============================================
++``amd-pstate`` CPU Performance Scaling Driver
++===============================================
++
++:Copyright: |copy| 2021 Advanced Micro Devices, Inc.
++
++:Author: Huang Rui <ray.huang@amd.com>
++
++
++Introduction
++===================
++
++``amd-pstate`` is the AMD CPU performance scaling driver that introduces a
++new CPU frequency control mechanism on modern AMD APU and CPU series in the
++Linux kernel. The new mechanism is based on Collaborative Processor
++Performance Control (CPPC), which provides finer-grained frequency management
++than legacy ACPI hardware P-States. Current AMD CPU/APU platforms use the
++ACPI P-states driver to manage CPU frequency and clocks, switching among
++only 3 P-states. CPPC replaces the ACPI P-states controls and provides a
++flexible, low-latency interface for the Linux kernel to communicate
++performance hints directly to hardware.
++
++``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``,
++``ondemand``, etc. to manage the performance hints which are provided by
++CPPC hardware functionality that internally follows the hardware
++specification (for details refer to AMD64 Architecture Programmer's Manual
++Volume 2: System Programming [1]_). Currently ``amd-pstate`` supports basic
++frequency control according to the kernel governors on some of the Zen2 and
++Zen3 processors; more AMD-specific functions will be implemented in the
++future after they are verified on the hardware and SBIOS.
++
++
++AMD CPPC Overview
++=======================
++
++Collaborative Processor Performance Control (CPPC) interface enumerates a
++continuous, abstract, and unit-less performance value in a scale that is
++not tied to a specific performance state / frequency. This is an ACPI
++standard [2]_ through which software can specify application performance
++goals and hints as a relative target to the infrastructure limits. AMD
++processors provide a low-latency register model (MSR) instead of an AML code
++interpreter for performance adjustments. ``amd-pstate`` will initialize a
++``struct cpufreq_driver`` instance ``amd_pstate_driver`` with the callbacks
++to manage each performance update behavior. ::
++
++ Highest Perf ------>+-----------------------+ +-----------------------+
++ | | | |
++ | | | |
++ | | Max Perf ---->| |
++ | | | |
++ | | | |
++ Nominal Perf ------>+-----------------------+ +-----------------------+
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | Desired Perf ---->| |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ | | | |
++ Lowest non- | | | |
++ linear perf ------>+-----------------------+ +-----------------------+
++ | | | |
++ | | Lowest perf ---->| |
++ | | | |
++ Lowest perf ------>+-----------------------+ +-----------------------+
++ | | | |
++ | | | |
++ | | | |
++ 0 ------>+-----------------------+ +-----------------------+
++
++ AMD P-States Performance Scale
++
++
++.. _perf_cap:
++
++AMD CPPC Performance Capability
++--------------------------------
++
++Highest Performance (RO)
++.........................
++
++It is the absolute maximum performance an individual processor may reach,
++assuming ideal conditions. This performance level may not be sustainable
++for long durations and may only be achievable if other platform components
++are in a specific state; for example, it may require other processors be in
++an idle state. This would be equivalent to the highest frequencies
++supported by the processor.
++
++Nominal (Guaranteed) Performance (RO)
++......................................
++
++It is the maximum sustained performance level of the processor, assuming
++ideal operating conditions. In the absence of an external constraint (power,
++thermal, etc.) this is the performance level the processor is expected to
++be able to maintain continuously. All cores/processors are expected to be
++able to sustain their nominal performance state simultaneously.
++
++Lowest non-linear Performance (RO)
++...................................
++
++It is the lowest performance level at which nonlinear power savings are
++achieved, for example, due to the combined effects of voltage and frequency
++scaling. Above this threshold, lower performance levels should be generally
++more energy efficient than higher performance levels. This register
++effectively conveys the most efficient performance level to ``amd-pstate``.
++
++Lowest Performance (RO)
++........................
++
++It is the absolute lowest performance level of the processor. Selecting a
++performance level lower than the lowest nonlinear performance level may
++cause an efficiency penalty but should reduce the instantaneous power
++consumption of the processor.
++
++AMD CPPC Performance Control
++------------------------------
++
++``amd-pstate`` passes performance goals through these registers. The
++registers drive the behavior of the desired performance target.
++
++Minimum requested performance (RW)
++...................................
++
++``amd-pstate`` specifies the minimum allowed performance level.
++
++Maximum requested performance (RW)
++...................................
++
++``amd-pstate`` specifies a limit on the maximum performance that is expected
++to be supplied by the hardware.
++
++Desired performance target (RW)
++...................................
++
++``amd-pstate`` specifies a desired target in the CPPC performance scale as
++a relative number. This can be expressed as a percentage of nominal
++performance (infrastructure max). Below the nominal sustained performance
++level, desired performance expresses the average performance level of the
++processor subject to hardware. Above the nominal performance level, the
++processor must provide at least the nominal performance requested and go
++higher if current operating conditions allow.
++
++Energy Performance Preference (EPP) (RW)
++.........................................
++
++Provides a hint to the hardware on whether software wants to bias toward
++performance (0x0) or energy efficiency (0xff).
++
++
++Key Governors Support
++=======================
++
++``amd-pstate`` can be used with all the (generic) scaling governors listed
++by the ``scaling_available_governors`` policy attribute in ``sysfs``. Then,
++it is responsible for the configuration of policy objects corresponding to
++CPUs and provides the ``CPUFreq`` core (and the scaling governors attached
++to the policy objects) with accurate information on the maximum and minimum
++operating frequencies supported by the hardware. Users can check the
++``scaling_cur_freq`` information, which comes from the ``CPUFreq`` core.
++
++``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic
++frequency control. The intent is to fine-tune the processor configuration on
++``amd-pstate`` for ``schedutil`` with the CPU CFS scheduler. ``amd-pstate``
++registers the adjust_perf callback to implement performance update behavior
++similar to CPPC. It is initialized by ``sugov_start`` and then populates the
++CPU's update_util_data pointer to assign ``sugov_update_single_perf`` as
++the utilization update callback function in the CPU scheduler. The CPU
++scheduler calls ``cpufreq_update_util`` and assigns the target performance
++according to the ``struct sugov_cpu`` that the utilization update belongs to.
++Then ``amd-pstate`` updates the desired performance according to the value
++assigned by the CPU scheduler.
++
++
++Processor Support
++=======================
++
++The ``amd-pstate`` initialization will fail if the _CPC entry in the ACPI
++SBIOS does not exist for the detected processor; it uses ``acpi_cpc_valid``
++to check for the existence of _CPC. All Zen based processors support the
++legacy ACPI hardware P-States function, so when ``amd-pstate`` fails to
++initialize, the kernel falls back to initializing the ``acpi-cpufreq``
++driver.
++
++There are two types of hardware implementations for ``amd-pstate``: one is
++`Full MSR Support <perf_cap_>`_ and another is `Shared Memory Support
++<perf_cap_>`_. The :c:macro:`X86_FEATURE_CPPC` feature flag (for
++details refer to Processor Programming Reference (PPR) for AMD Family
++19h Model 51h, Revision A1 Processors [3]_) distinguishes the two
++types. ``amd-pstate`` registers different ``static_call`` instances
++for different hardware implementations.
++
++Currently, some of Zen2 and Zen3 processors support ``amd-pstate``. In the
++future, it will be supported on more and more AMD processors.
++
++Full MSR Support
++-----------------
++
++Some new Zen3 processors such as Cezanne provide the MSR registers directly
++when the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set.
++``amd-pstate`` can handle the MSR registers to implement the fast switch
++function in ``CPUFreq``, which reduces the latency of frequency control in
++interrupt context. The functions with the ``pstate_xxx`` prefix represent
++the operations on MSR registers.
++
++Shared Memory Support
++----------------------
++
++If the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, the
++processor supports the shared memory solution. In this case, ``amd-pstate``
++uses the ``cppc_acpi`` helper methods to implement the callback functions
++defined on ``static_call``. The functions with the ``cppc_xxx`` prefix are
++the operations of ACPI CPPC helpers for the shared memory solution.
++
++
++AMD P-States and ACPI hardware P-States can always be supported in one
++processor, but AMD P-States has the higher priority: if it is enabled with
++:c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, the processor will
++respond to the requests from AMD P-States.
++
++
++User Space Interface in ``sysfs``
++==================================
++
++``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
++control its functionality at the system level. They are located in the
++``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect all CPUs. ::
++
++ root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
++ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
++ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
++ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
++
++
++``amd_pstate_highest_perf / amd_pstate_max_freq``
++
++Maximum CPPC performance and CPU frequency that the driver is allowed to
++set in percent of the maximum supported CPPC performance level (the highest
++performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
++On some ASICs, the highest CPPC performance is not the one in the _CPC
++table, so we need to expose it to sysfs. If boost is not active but
++supported, this maximum frequency will be larger than the one in
++``cpuinfo``.
++This attribute is read-only.
++
++``amd_pstate_lowest_nonlinear_freq``
++
++The lowest non-linear CPPC CPU frequency that the driver is allowed to set
++in percent of the maximum supported CPPC performance level (Please see the
++lowest non-linear performance in `AMD CPPC Performance Capability
++<perf_cap_>`_).
++This attribute is read-only.
++
++For other performance and frequency values, we can read them back from
++``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
++
++
++``amd-pstate`` vs ``acpi-cpufreq``
++======================================
++
++On the majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI
++tables provided by the platform firmware are used for CPU performance
++scaling, but they only provide 3 P-states on AMD processors.
++However, modern AMD APU and CPU series provide Collaborative Processor
++Performance Control according to the ACPI protocol, customized for AMD
++platforms. That is a fine-grained and continuous frequency range
++instead of the legacy hardware P-states. ``amd-pstate`` is the kernel
++module which supports the new AMD P-States mechanism on most future AMD
++platforms. The AMD P-States mechanism is the more performant and
++energy-efficient frequency management method on AMD processors.
++
++Kernel Module Options for ``amd-pstate``
++=========================================
++
++``shared_mem``
++Use a module param (shared_mem) to enable related processors manually with
++**amd_pstate.shared_mem=1**.
++Due to a performance issue on the processors with `Shared Memory Support
++<perf_cap_>`_, it is disabled for the moment; it will be enabled by default
++once the performance issue with this solution is addressed.
++
++To check whether the current processor has `Full MSR Support <perf_cap_>`_
++or `Shared Memory Support <perf_cap_>`_ : ::
++
++ ray@hr-test1:~$ lscpu | grep cppc
++ Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
++
++If the CPU flags include ``cppc``, then this processor has `Full MSR Support
++<perf_cap_>`_. Otherwise it has `Shared Memory Support <perf_cap_>`_.
++
++
++``cpupower`` tool support for ``amd-pstate``
++===============================================
++
++``amd-pstate`` is supported by the ``cpupower`` tool, which can be used to dump
++the frequency information. Support for more operations on the new
++``amd-pstate`` module with this tool is in progress. ::
++
++ root@hr-test1:/home/ray# cpupower frequency-info
++ analyzing CPU 0:
++ driver: amd-pstate
++ CPUs which run at the same hardware frequency: 0
++ CPUs which need to have their frequency coordinated by software: 0
++ maximum transition latency: 131 us
++ hardware limits: 400 MHz - 4.68 GHz
++ available cpufreq governors: ondemand conservative powersave userspace performance schedutil
++ current policy: frequency should be within 400 MHz and 4.68 GHz.
++ The governor "schedutil" may decide which speed to use
++ within this range.
++ current CPU frequency: Unable to call hardware
++ current CPU frequency: 4.02 GHz (asserted by call to kernel)
++ boost state support:
++ Supported: yes
++ Active: yes
++ AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.68 GHz.
++ AMD PSTATE Nominal Performance: 117. Nominal Frequency: 3.30 GHz.
++ AMD PSTATE Lowest Non-linear Performance: 39. Lowest Non-linear Frequency: 1.10 GHz.
++ AMD PSTATE Lowest Performance: 15. Lowest Frequency: 400 MHz.
++
++
++Diagnostics and Tuning
++=======================
++
++Trace Events
++--------------
++
++There are two static trace events that can be used for ``amd-pstate``
++diagnostics. One of them is the cpu_frequency trace event generally used
++by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event
++specific to ``amd-pstate``. The following sequence of shell commands can
++be used to enable them and see their output (if the kernel is generally
++configured to support event tracing). ::
++
++ root@hr-test1:/home/ray# cd /sys/kernel/tracing/
++ root@hr-test1:/sys/kernel/tracing# echo 1 > events/amd_cpu/enable
++ root@hr-test1:/sys/kernel/tracing# cat trace
++ # tracer: nop
++ #
++ # entries-in-buffer/entries-written: 47827/42233061 #P:2
++ #
++ # _-----=> irqs-off
++ # / _----=> need-resched
++ # | / _---=> hardirq/softirq
++ # || / _--=> preempt-depth
++ # ||| / delay
++ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
++ # | | | |||| | |
++ <idle>-0 [015] dN... 4995.979886: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=15 changed=false fast_switch=true
++ <idle>-0 [007] d.h.. 4995.979893: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
++ cat-2161 [000] d.... 4995.980841: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=0 changed=false fast_switch=true
++ sshd-2125 [004] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=4 changed=false fast_switch=true
++ <idle>-0 [007] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
++ <idle>-0 [003] d.s.. 4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true
++ <idle>-0 [011] d.s.. 4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true
++
++The cpu_frequency trace event will be triggered either by the ``schedutil`` scaling
++governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the
++policies with other scaling governors).
++
++
++Reference
++===========
++
++.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
++ https://www.amd.com/system/files/TechDocs/24593.pdf
++
++.. [2] Advanced Configuration and Power Interface Specification,
++ https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
++
++.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors
++ https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip
++
+diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst
+index f40994c422dc..5d2757e2de65 100644
+--- a/Documentation/admin-guide/pm/working-state.rst
++++ b/Documentation/admin-guide/pm/working-state.rst
+@@ -11,6 +11,7 @@ Working-State Power Management
+ intel_idle
+ cpufreq
+ intel_pstate
++ amd-pstate
+ cpufreq_drivers
+ intel_epb
+ intel-speed-select
+diff --git a/MAINTAINERS b/MAINTAINERS
+index fe347675fb5c..8e0666a552df 100644
+--- a/MAINTAINERS
++++ b/MAINTAINERS
+@@ -975,6 +975,13 @@ S: Supported
+ T: git https://gitlab.freedesktop.org/agd5f/linux.git
+ F: drivers/gpu/drm/amd/pm/
+
++AMD PSTATE DRIVER
++M: Huang Rui <ray.huang@amd.com>
++L: linux-pm@vger.kernel.org
++S: Supported
++F: Documentation/admin-guide/pm/amd-pstate.rst
++F: drivers/cpufreq/amd-pstate*
++
+ AMD PTDMA DRIVER
+ M: Sanjay R Mehta <sanju.mehta@amd.com>
+ L: dmaengine@vger.kernel.org
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index d5b5f2ab87a0..18de5f76f198 100644
--- a/arch/x86/include/asm/cpufeatures.h
@@ -25,20 +442,8 @@ index d5b5f2ab87a0..18de5f76f198 100644
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
-
-
-
-AMD CPPC (Collaborative Processor Performance Control) function uses MSR
-registers to manage the performance hints. So add the MSR register macro
-here.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- arch/x86/include/asm/msr-index.h | 17 +++++++++++++++++
- 1 file changed, 17 insertions(+)
-
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
-index 01e2650b9585..e7945ef6a8df 100644
+index 01e2650b9585..3faf0f97edb1 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -486,6 +486,23 @@
@@ -52,56 +457,48 @@ index 01e2650b9585..e7945ef6a8df 100644
+#define MSR_AMD_CPPC_REQ 0xc00102b3
+#define MSR_AMD_CPPC_STATUS 0xc00102b4
+
-+#define CAP1_LOWEST_PERF(x) (((x) >> 0) & 0xff)
-+#define CAP1_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
-+#define CAP1_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
-+#define CAP1_HIGHEST_PERF(x) (((x) >> 24) & 0xff)
++#define AMD_CPPC_LOWEST_PERF(x) (((x) >> 0) & 0xff)
++#define AMD_CPPC_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
++#define AMD_CPPC_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
++#define AMD_CPPC_HIGHEST_PERF(x) (((x) >> 24) & 0xff)
+
-+#define REQ_MAX_PERF(x) (((x) & 0xff) << 0)
-+#define REQ_MIN_PERF(x) (((x) & 0xff) << 8)
-+#define REQ_DES_PERF(x) (((x) & 0xff) << 16)
-+#define REQ_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)
++#define AMD_CPPC_MAX_PERF(x) (((x) & 0xff) << 0)
++#define AMD_CPPC_MIN_PERF(x) (((x) & 0xff) << 8)
++#define AMD_CPPC_DES_PERF(x) (((x) & 0xff) << 16)
++#define AMD_CPPC_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)
+
/* Fam 17h MSRs */
#define MSR_F17H_IRPERF 0xc00000e9
-
-
-
-From: Steven Noonan <steven@valvesoftware.com>
-
-According to the ACPI v6.2 (and later) specification, SystemIO can be
-used for _CPC registers. This teaches cppc_acpi how to handle such
-registers.
-
-This patch was tested using the amd_pstate driver on my Zephyrus G15
-(model GA503QS) using the current version 410 BIOS, which uses
-a SystemIO register for the HighestPerformance element in _CPC.
-
-Signed-off-by: Steven Noonan <steven@valvesoftware.com>
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/acpi/cppc_acpi.c | 46 +++++++++++++++++++++++++++++++++++++---
- 1 file changed, 43 insertions(+), 3 deletions(-)
-
diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
-index a85c351589be..ca62c3dc9899 100644
+index a85c351589be..6c0a55a17dfc 100644
--- a/drivers/acpi/cppc_acpi.c
+++ b/drivers/acpi/cppc_acpi.c
-@@ -746,9 +746,24 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
+@@ -118,6 +118,8 @@ static DEFINE_PER_CPU(struct cpc_desc *, cpc_desc_ptr);
+ */
+ #define NUM_RETRIES 500ULL
+
++#define OVER_16BTS_MASK ~0xFFFFULL
++
+ #define define_one_cppc_ro(_name) \
+ static struct kobj_attribute _name = \
+ __ATTR(_name, 0444, show_##_name, NULL)
+@@ -746,9 +748,26 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
goto out_free;
cpc_ptr->cpc_regs[i-2].sys_mem_vaddr = addr;
}
+ } else if (gas_t->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
+ if (gas_t->access_width < 1 || gas_t->access_width > 3) {
-+ /* 1 = 8-bit, 2 = 16-bit, and 3 = 32-bit. SystemIO doesn't
-+ * implement 64-bit registers.
++ /*
++ * 1 = 8-bit, 2 = 16-bit, and 3 = 32-bit.
++ * SystemIO doesn't implement 64-bit
++ * registers.
+ */
+ pr_debug("Invalid access width %d for SystemIO register\n",
+ gas_t->access_width);
+ goto out_free;
+ }
-+ if (gas_t->address & ~0xFFFFULL) {
++ if (gas_t->address & OVER_16BTS_MASK) {
+ /* SystemIO registers use 16-bit integer addresses */
+ pr_debug("Invalid IO port %llu for SystemIO register\n",
+ gas_t->address);
@@ -114,7 +511,7 @@ index a85c351589be..ca62c3dc9899 100644
pr_debug("Unsupported register type: %d\n", gas_t->space_id);
goto out_free;
}
-@@ -923,7 +938,20 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
+@@ -923,7 +942,21 @@ static int cpc_read(int cpu, struct cpc_register_resource *reg_res, u64 *val)
}
*val = 0;
@@ -124,10 +521,11 @@ index a85c351589be..ca62c3dc9899 100644
+ u32 width = 8 << (reg->access_width - 1);
+ acpi_status status;
+
-+ status = acpi_os_read_port((acpi_io_address)reg->address, (u32 *)val, width);
-+
-+ if (status != AE_OK) {
-+ pr_debug("Error: Failed to read SystemIO port %llx\n", reg->address);
++ status = acpi_os_read_port((acpi_io_address)reg->address,
++ (u32 *)val, width);
++ if (ACPI_FAILURE(status)) {
++ pr_debug("Error: Failed to read SystemIO port %llx\n",
++ reg->address);
+ return -EFAULT;
+ }
+
@@ -136,7 +534,7 @@ index a85c351589be..ca62c3dc9899 100644
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
-@@ -962,7 +990,19 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
+@@ -962,7 +995,20 @@ static int cpc_write(int cpu, struct cpc_register_resource *reg_res, u64 val)
int pcc_ss_id = per_cpu(cpu_pcc_subspace_idx, cpu);
struct cpc_reg *reg = &reg_res->cpc_entry.reg;
@@ -145,10 +543,11 @@ index a85c351589be..ca62c3dc9899 100644
+ u32 width = 8 << (reg->access_width - 1);
+ acpi_status status;
+
-+ status = acpi_os_write_port((acpi_io_address)reg->address, (u32)val, width);
-+
-+ if (status != AE_OK) {
-+ pr_debug("Error: Failed to write SystemIO port %llx\n", reg->address);
++ status = acpi_os_write_port((acpi_io_address)reg->address,
++ (u32)val, width);
++ if (ACPI_FAILURE(status)) {
++ pr_debug("Error: Failed to write SystemIO port %llx\n",
++ reg->address);
+ return -EFAULT;
+ }
+
@@ -157,71 +556,7 @@ index a85c351589be..ca62c3dc9899 100644
vaddr = GET_PCC_VADDR(reg->address, pcc_ss_id);
else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
vaddr = reg_res->sys_mem_vaddr;
-
-
-
-From: Mario Limonciello <mario.limonciello@amd.com>
-
-As this is a static check, it should be based upon what is currently
-present on the system. This makes probeing more deterministic.
-
-While local APIC flags field (lapic_flags) of cpu core in MADT table is
-0, then the cpu core won't be enabled. In this case, _CPC won't be found
-in this core, and return back to _CPC invalid with walking through
-possible cpus (include disable cpus). This is not expected, so switch to
-check present CPUs instead.
-
-Reported-by: Jinzhou Su <Jinzhou.Su@amd.com>
-Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/acpi/cppc_acpi.c | 2 +-
- 1 file changed, 1 insertion(+), 1 deletion(-)
-
-diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
-index ca62c3dc9899..a46f227dc254 100644
---- a/drivers/acpi/cppc_acpi.c
-+++ b/drivers/acpi/cppc_acpi.c
-@@ -411,7 +411,7 @@ bool acpi_cpc_valid(void)
- struct cpc_desc *cpc_ptr;
- int cpu;
-
-- for_each_possible_cpu(cpu) {
-+ for_each_present_cpu(cpu) {
- cpc_ptr = per_cpu(cpc_desc_ptr, cpu);
- if (!cpc_ptr)
- return false;
-
-
-
-From: Jinzhou Su <Jinzhou.Su@amd.com>
-
-Add a new function to enable CPPC feature. This function
-will write Continuous Performance Control package
-EnableRegister field on the processor.
-
-CPPC EnableRegister register described in section 8.4.7.1 of ACPI 6.4:
-This element is optional. If supported, contains a resource descriptor
-with a single Register() descriptor that describes a register to which
-OSPM writes a One to enable CPPC on this processor. Before this register
-is set, the processor will be controlled by legacy mechanisms (ACPI
-Pstates, firmware, etc.).
-
-This register will be used for AMD processors to enable amd-pstate
-function instead of legacy ACPI P-States.
-
-Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/acpi/cppc_acpi.c | 45 ++++++++++++++++++++++++++++++++++++++++
- include/acpi/cppc_acpi.h | 5 +++++
- 2 files changed, 50 insertions(+)
-
-diff --git a/drivers/acpi/cppc_acpi.c b/drivers/acpi/cppc_acpi.c
-index a46f227dc254..003df9fba122 100644
---- a/drivers/acpi/cppc_acpi.c
-+++ b/drivers/acpi/cppc_acpi.c
-@@ -1262,6 +1262,51 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
+@@ -1222,6 +1268,51 @@ int cppc_get_perf_ctrs(int cpunum, struct cppc_perf_fb_ctrs *perf_fb_ctrs)
}
EXPORT_SYMBOL_GPL(cppc_get_perf_ctrs);
@@ -273,150 +608,8 @@ index a46f227dc254..003df9fba122 100644
/**
* cppc_set_perf - Set a CPU's performance controls.
* @cpu: CPU for which to set performance controls.
-diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
-index bc159a9b4a73..92b7ea8d8f5e 100644
---- a/include/acpi/cppc_acpi.h
-+++ b/include/acpi/cppc_acpi.h
-@@ -138,6 +138,7 @@ extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
- extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
- extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
- extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
-+extern int cppc_set_enable(int cpu, bool enable);
- extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
- extern bool acpi_cpc_valid(void);
- extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
-@@ -162,6 +163,10 @@ static inline int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
- {
- return -ENOTSUPP;
- }
-+static inline int cppc_set_enable(int cpu, bool enable)
-+{
-+ return -ENOTSUPP;
-+}
- static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
- {
- return -ENOTSUPP;
-
-
-
-amd-pstate is the AMD CPU performance scaling driver that introduces a
-new CPU frequency control mechanism on AMD Zen based CPU series in Linux
-kernel. The new mechanism is based on Collaborative processor
-performance control (CPPC) which is finer grain frequency management
-than legacy ACPI hardware P-States. Current AMD CPU platforms are using
-the ACPI P-states driver to manage CPU frequency and clocks with
-switching only in 3 P-states. AMD P-States is to replace the ACPI
-P-states controls, allows a flexible, low-latency interface for the
-Linux kernel to directly communicate the performance hints to hardware.
-
-"amd-pstate" leverages the Linux kernel governors such as *schedutil*,
-*ondemand*, etc. to manage the performance hints which are provided by CPPC
-hardware functionality. The first version for amd-pstate is to support one
-of the Zen3 processors, and we will support more in future after we verify
-the hardware and SBIOS functionalities.
-
-There are two types of hardware implementations for amd-pstate: one is full
-MSR support and another is shared memory support. It can use
-X86_FEATURE_CPPC feature flag to distinguish the different types.
-
-Using the new AMD P-States method + kernel governors (*schedutil*,
-*ondemand*, ...) to manage the frequency update is the most appropriate
-bridge between AMD Zen based hardware processor and Linux kernel, the
-processor is able to adjust to the most efficiency frequency according to
-the kernel scheduler loading.
-
-Performance Per Watt (PPW) Calculation:
-
-The PPW calculation is referred by below paper:
-https://software.intel.com/content/dam/develop/external/us/en/documents/performance-per-what-paper.pdf
-
-Below formula is referred from below spec to measure the PPW:
-
-(F / t) / P = F * t / (t * E) = F / E,
-
-"F" is the number of frames per second.
-"P" is power measured in watts.
-"E" is energy measured in joules.
-
-We use the RAPL interface with "perf" tool to get the energy data of the
-package power.
-
-The data comparisons between amd-pstate and acpi-freq module are tested on
-AMD Cezanne processor:
-
-1) TBench CPU benchmark:
-
-+---------------------------------------------------------------------+
-| |
-| TBench (Performance Per Watt) |
-| Higher is better |
-+-------------------+------------------------+------------------------+
-| | Performance Per Watt | Performance Per Watt |
-| Kernel Module | (Schedutil) | (Ondemand) |
-| | Unit: MB / (s * J) | Unit: MB / (s * J) |
-+-------------------+------------------------+------------------------+
-| | | |
-| acpi-cpufreq | 3.022 | 2.969 |
-| | | |
-+-------------------+------------------------+------------------------+
-| | | |
-| amd-pstate | 3.131 | 3.284 |
-| | | |
-+-------------------+------------------------+------------------------+
-
-2) Gitsource CPU benchmark:
-
-+---------------------------------------------------------------------+
-| |
-| Gitsource (Performance Per Watt) |
-| Higher is better |
-+-------------------+------------------------+------------------------+
-| | Performance Per Watt | Performance Per Watt |
-| Kernel Module | (Schedutil) | (Ondemand) |
-| | Unit: 1 / (s * J) | Unit: 1 / (s * J) |
-+-------------------+------------------------+------------------------+
-| | | |
-| acpi-cpufreq | 3.42172E-07 | 2.74508E-07 |
-| | | |
-+-------------------+------------------------+------------------------+
-| | | |
-| amd-pstate | 4.09141E-07 | 3.47610E-07 |
-| | | |
-+-------------------+------------------------+------------------------+
-
-3) Speedometer 2.0 CPU benchmark:
-
-+---------------------------------------------------------------------+
-| |
-| Speedometer 2.0 (Performance Per Watt) |
-| Higher is better |
-+-------------------+------------------------+------------------------+
-| | Performance Per Watt | Performance Per Watt |
-| Kernel Module | (Schedutil) | (Ondemand) |
-| | Unit: 1 / (s * J) | Unit: 1 / (s * J) |
-+-------------------+------------------------+------------------------+
-| | | |
-| acpi-cpufreq | 0.116111767 | 0.110321664 |
-| | | |
-+-------------------+------------------------+------------------------+
-| | | |
-| amd-pstate | 0.115825281 | 0.122024299 |
-| | | |
-+-------------------+------------------------+------------------------+
-
-According to above average data, we can see this solution has shown better
-performance per watt scaling on mobile CPU benchmarks in most of cases.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/Kconfig.x86 | 17 ++
- drivers/cpufreq/Makefile | 1 +
- drivers/cpufreq/amd-pstate.c | 398 +++++++++++++++++++++++++++++++++++
- 3 files changed, 416 insertions(+)
- create mode 100644 drivers/cpufreq/amd-pstate.c
-
diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
-index 92701a18bdd9..21837eb1698b 100644
+index 92701a18bdd9..a951768c3ebb 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -34,6 +34,23 @@ config X86_PCC_CPUFREQ
@@ -432,8 +625,8 @@ index 92701a18bdd9..21837eb1698b 100644
+ help
+ This driver adds a CPUFreq driver which utilizes a fine grain
+ processor performance frequency control range instead of legacy
-+ performance levels. This driver supports the AMD processors with
-+ _CPC object in the SBIOS.
++ performance levels. _CPC needs to be present in the ACPI tables
++ of the system.
+
+ For details, take a look at:
+ <file:Documentation/admin-guide/pm/amd-pstate.rst>.
@@ -444,23 +637,125 @@ index 92701a18bdd9..21837eb1698b 100644
tristate "ACPI Processor P-States driver"
depends on ACPI_PROCESSOR
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
-index 48ee5859030c..c8d307010922 100644
+index 48ee5859030c..285de70af877 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
-@@ -25,6 +25,7 @@ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o
+@@ -17,6 +17,10 @@ obj-$(CONFIG_CPU_FREQ_GOV_ATTR_SET) += cpufreq_governor_attr_set.o
+ obj-$(CONFIG_CPUFREQ_DT) += cpufreq-dt.o
+ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o
+
++# Traces
++CFLAGS_amd-pstate-trace.o := -I$(src)
++amd_pstate-y := amd-pstate.o amd-pstate-trace.o
++
+ ##################################################################################
+ # x86 drivers.
+ # Link order matters. K8 is preferred to ACPI because of firmware bugs in early
+@@ -25,6 +29,7 @@ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o
# speedstep-* is preferred over p4-clockmod.
obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o
-+obj-$(CONFIG_X86_AMD_PSTATE) += amd-pstate.o
++obj-$(CONFIG_X86_AMD_PSTATE) += amd_pstate.o
obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o
obj-$(CONFIG_X86_PCC_CPUFREQ) += pcc-cpufreq.o
obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o
+diff --git a/drivers/cpufreq/amd-pstate-trace.c b/drivers/cpufreq/amd-pstate-trace.c
+new file mode 100644
+index 000000000000..891b696dcd69
+--- /dev/null
++++ b/drivers/cpufreq/amd-pstate-trace.c
+@@ -0,0 +1,2 @@
++#define CREATE_TRACE_POINTS
++#include "amd-pstate-trace.h"
+diff --git a/drivers/cpufreq/amd-pstate-trace.h b/drivers/cpufreq/amd-pstate-trace.h
+new file mode 100644
+index 000000000000..647505957d4f
+--- /dev/null
++++ b/drivers/cpufreq/amd-pstate-trace.h
+@@ -0,0 +1,77 @@
++/* SPDX-License-Identifier: GPL-2.0 */
++/*
++ * amd-pstate-trace.h - AMD Processor P-state Frequency Driver Tracer
++ *
++ * Copyright (C) 2021 Advanced Micro Devices, Inc. All Rights Reserved.
++ *
++ * Author: Huang Rui <ray.huang@amd.com>
++ */
++
++#if !defined(_AMD_PSTATE_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
++#define _AMD_PSTATE_TRACE_H
++
++#include <linux/cpufreq.h>
++#include <linux/tracepoint.h>
++#include <linux/trace_events.h>
++
++#undef TRACE_SYSTEM
++#define TRACE_SYSTEM amd_cpu
++
++#undef TRACE_INCLUDE_FILE
++#define TRACE_INCLUDE_FILE amd-pstate-trace
++
++#define TPS(x) tracepoint_string(x)
++
++TRACE_EVENT(amd_pstate_perf,
++
++ TP_PROTO(unsigned long min_perf,
++ unsigned long target_perf,
++ unsigned long capacity,
++ unsigned int cpu_id,
++ bool changed,
++ bool fast_switch
++ ),
++
++ TP_ARGS(min_perf,
++ target_perf,
++ capacity,
++ cpu_id,
++ changed,
++ fast_switch
++ ),
++
++ TP_STRUCT__entry(
++ __field(unsigned long, min_perf)
++ __field(unsigned long, target_perf)
++ __field(unsigned long, capacity)
++ __field(unsigned int, cpu_id)
++ __field(bool, changed)
++ __field(bool, fast_switch)
++ ),
++
++ TP_fast_assign(
++ __entry->min_perf = min_perf;
++ __entry->target_perf = target_perf;
++ __entry->capacity = capacity;
++ __entry->cpu_id = cpu_id;
++ __entry->changed = changed;
++ __entry->fast_switch = fast_switch;
++ ),
++
++ TP_printk("amd_min_perf=%lu amd_des_perf=%lu amd_max_perf=%lu cpu_id=%u changed=%s fast_switch=%s",
++ (unsigned long)__entry->min_perf,
++ (unsigned long)__entry->target_perf,
++ (unsigned long)__entry->capacity,
++ (unsigned int)__entry->cpu_id,
++ (__entry->changed) ? "true" : "false",
++ (__entry->fast_switch) ? "true" : "false"
++ )
++);
++
++#endif /* _AMD_PSTATE_TRACE_H */
++
++/* This part must be outside protection */
++#undef TRACE_INCLUDE_PATH
++#define TRACE_INCLUDE_PATH .
++
++#include <trace/define_trace.h>
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
new file mode 100644
-index 000000000000..8b501a72c3dd
+index 000000000000..40ceb031abf5
--- /dev/null
+++ b/drivers/cpufreq/amd-pstate.c
-@@ -0,0 +1,398 @@
+@@ -0,0 +1,643 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * amd-pstate.c - AMD Processor P-state Frequency Driver
@@ -468,6 +763,19 @@ index 000000000000..8b501a72c3dd
+ * Copyright (C) 2021 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Author: Huang Rui <ray.huang@amd.com>
++ *
++ * AMD P-State introduces a new CPU performance scaling design for AMD
++ * processors using the ACPI Collaborative Performance and Power Control (CPPC)
++ * feature which works with the AMD SMU firmware providing a finer grained
++ * frequency control range. It is to replace the legacy ACPI P-States control,
++ * allows a flexible, low-latency interface for the Linux kernel to directly
++ * communicate the performance hints to hardware.
++ *
++ * AMD P-State is supported on recent AMD Zen base CPU series include some of
++ * Zen2 and Zen3 processors. _CPC needs to be present in the ACPI tables of AMD
++ * P-State supported system. And there are two types of hardware implementations
++ * for AMD P-State: 1) Full MSR Solution and 2) Shared Memory Solution.
++ * X86_FEATURE_CPPC CPU feature flag is used to distinguish the different types.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
@@ -494,17 +802,50 @@ index 000000000000..8b501a72c3dd
+#include <asm/processor.h>
+#include <asm/cpufeature.h>
+#include <asm/cpu_device_id.h>
++#include "amd-pstate-trace.h"
+
+#define AMD_PSTATE_TRANSITION_LATENCY 0x20000
+#define AMD_PSTATE_TRANSITION_DELAY 500
+
++/*
++ * TODO: We need more time to fine tune processors with shared memory solution
++ * with community together.
++ *
++ * There are some performance drops on the CPU benchmarks which reports from
++ * Suse. We are co-working with them to fine tune the shared memory solution. So
++ * we disable it by default to go acpi-cpufreq on these processors and add a
++ * module parameter to be able to enable it manually for debugging.
++ */
++static bool shared_mem = false;
++module_param(shared_mem, bool, 0444);
++MODULE_PARM_DESC(shared_mem,
++ "enable amd-pstate on processors with shared memory solution (false = disabled (default), true = enabled)");
++
+static struct cpufreq_driver amd_pstate_driver;
+
++/**
++ * struct amd_cpudata - private CPU data for AMD P-State
++ * @cpu: CPU number
++ * @cppc_req_cached: cached performance request hints
++ * @highest_perf: the maximum performance an individual processor may reach,
++ * assuming ideal conditions
++ * @nominal_perf: the maximum sustained performance level of the processor,
++ * assuming ideal operating conditions
++ * @lowest_nonlinear_perf: the lowest performance level at which nonlinear power
++ * savings are achieved
++ * @lowest_perf: the absolute lowest performance level of the processor
++ * @max_freq: the frequency that mapped to highest_perf
++ * @min_freq: the frequency that mapped to lowest_perf
++ * @nominal_freq: the frequency that mapped to nominal_perf
++ * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
++ *
++ * The amd_cpudata is key private data for each CPU thread in AMD P-State, and
++ * represents all the attributes and goals that AMD P-State requests at runtime.
++ */
+struct amd_cpudata {
+ int cpu;
+
-+ struct freq_qos_request req[2];
-+
++ struct freq_qos_request req[2];
+ u64 cppc_req_cached;
+
+ u32 highest_perf;
@@ -516,11 +857,26 @@ index 000000000000..8b501a72c3dd
+ u32 min_freq;
+ u32 nominal_freq;
+ u32 lowest_nonlinear_freq;
++
++ bool boost_supported;
+};
+
+static inline int pstate_enable(bool enable)
+{
-+ return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable ? 1 : 0);
++ return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
++}
++
++static int cppc_enable(bool enable)
++{
++ int cpu, ret = 0;
++
++ for_each_present_cpu(cpu) {
++ ret = cppc_set_enable(cpu, enable);
++ if (ret)
++ return ret;
++ }
++
++ return ret;
+}
+
+DEFINE_STATIC_CALL(amd_pstate_enable, pstate_enable);
@@ -546,9 +902,27 @@ index 000000000000..8b501a72c3dd
+ */
+ WRITE_ONCE(cpudata->highest_perf, amd_get_highest_perf());
+
-+ WRITE_ONCE(cpudata->nominal_perf, CAP1_NOMINAL_PERF(cap1));
-+ WRITE_ONCE(cpudata->lowest_nonlinear_perf, CAP1_LOWNONLIN_PERF(cap1));
-+ WRITE_ONCE(cpudata->lowest_perf, CAP1_LOWEST_PERF(cap1));
++ WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
++ WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
++ WRITE_ONCE(cpudata->lowest_perf, AMD_CPPC_LOWEST_PERF(cap1));
++
++ return 0;
++}
++
++static int cppc_init_perf(struct amd_cpudata *cpudata)
++{
++ struct cppc_perf_caps cppc_perf;
++
++ int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
++ if (ret)
++ return ret;
++
++ WRITE_ONCE(cpudata->highest_perf, amd_get_highest_perf());
++
++ WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
++ WRITE_ONCE(cpudata->lowest_nonlinear_perf,
++ cppc_perf.lowest_nonlinear_perf);
++ WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
+
+ return 0;
+}
@@ -570,6 +944,19 @@ index 000000000000..8b501a72c3dd
+ READ_ONCE(cpudata->cppc_req_cached));
+}
+
++static void cppc_update_perf(struct amd_cpudata *cpudata,
++ u32 min_perf, u32 des_perf,
++ u32 max_perf, bool fast_switch)
++{
++ struct cppc_perf_ctrls perf_ctrls;
++
++ perf_ctrls.max_perf = max_perf;
++ perf_ctrls.min_perf = min_perf;
++ perf_ctrls.desired_perf = des_perf;
++
++ cppc_set_perf(cpudata->cpu, &perf_ctrls);
++}
++
+DEFINE_STATIC_CALL(amd_pstate_update_perf, pstate_update_perf);
+
+static inline void amd_pstate_update_perf(struct amd_cpudata *cpudata,
@@ -586,14 +973,17 @@ index 000000000000..8b501a72c3dd
+ u64 prev = READ_ONCE(cpudata->cppc_req_cached);
+ u64 value = prev;
+
-+ value &= ~REQ_MIN_PERF(~0L);
-+ value |= REQ_MIN_PERF(min_perf);
++ value &= ~AMD_CPPC_MIN_PERF(~0L);
++ value |= AMD_CPPC_MIN_PERF(min_perf);
++
++ value &= ~AMD_CPPC_DES_PERF(~0L);
++ value |= AMD_CPPC_DES_PERF(des_perf);
+
-+ value &= ~REQ_DES_PERF(~0L);
-+ value |= REQ_DES_PERF(des_perf);
++ value &= ~AMD_CPPC_MAX_PERF(~0L);
++ value |= AMD_CPPC_MAX_PERF(max_perf);
+
-+ value &= ~REQ_MAX_PERF(~0L);
-+ value |= REQ_MAX_PERF(max_perf);
++ trace_amd_pstate_perf(min_perf, des_perf, max_perf,
++ cpudata->cpu, (value != prev), fast_switch);
+
+ if (value == prev)
+ return;
@@ -640,6 +1030,39 @@ index 000000000000..8b501a72c3dd
+ return 0;
+}
+
++static void amd_pstate_adjust_perf(unsigned int cpu,
++ unsigned long _min_perf,
++ unsigned long target_perf,
++ unsigned long capacity)
++{
++ unsigned long max_perf, min_perf, des_perf,
++ cap_perf, lowest_nonlinear_perf;
++ struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
++ struct amd_cpudata *cpudata = policy->driver_data;
++
++ cap_perf = READ_ONCE(cpudata->highest_perf);
++ lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
++
++ des_perf = cap_perf;
++ if (target_perf < capacity)
++ des_perf = DIV_ROUND_UP(cap_perf * target_perf, capacity);
++
++ min_perf = READ_ONCE(cpudata->highest_perf);
++ if (_min_perf < capacity)
++ min_perf = DIV_ROUND_UP(cap_perf * _min_perf, capacity);
++
++ if (min_perf < lowest_nonlinear_perf)
++ min_perf = lowest_nonlinear_perf;
++
++ max_perf = cap_perf;
++ if (max_perf < min_perf)
++ max_perf = min_perf;
++
++ des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
++
++ amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true);
++}
++
+static int amd_get_min_freq(struct amd_cpudata *cpudata)
+{
+ struct cppc_perf_caps cppc_perf;
@@ -712,6 +1135,45 @@ index 000000000000..8b501a72c3dd
+ return lowest_nonlinear_freq * 1000;
+}
+
++static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
++{
++ struct amd_cpudata *cpudata = policy->driver_data;
++ int ret;
++
++ if (!cpudata->boost_supported) {
++ pr_err("Boost mode is not supported by this processor or SBIOS\n");
++ return -EINVAL;
++ }
++
++ if (state)
++ policy->cpuinfo.max_freq = cpudata->max_freq;
++ else
++ policy->cpuinfo.max_freq = cpudata->nominal_freq;
++
++ policy->max = policy->cpuinfo.max_freq;
++
++ ret = freq_qos_update_request(&cpudata->req[1],
++ policy->cpuinfo.max_freq);
++ if (ret < 0)
++ return ret;
++
++ return 0;
++}
++
++static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
++{
++ u32 highest_perf, nominal_perf;
++
++ highest_perf = READ_ONCE(cpudata->highest_perf);
++ nominal_perf = READ_ONCE(cpudata->nominal_perf);
++
++ if (highest_perf <= nominal_perf)
++ return;
++
++ cpudata->boost_supported = true;
++ amd_pstate_driver.boost_enabled = true;
++}
++
+static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
+{
+ int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -756,6 +1218,9 @@ index 000000000000..8b501a72c3dd
+ /* It will be updated by governor */
+ policy->cur = policy->cpuinfo.min_freq;
+
++ if (boot_cpu_has(X86_FEATURE_CPPC))
++ policy->fast_switch_possible = true;
++
+ ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
+ FREQ_QOS_MIN, policy->cpuinfo.min_freq);
+ if (ret < 0) {
@@ -778,6 +1243,8 @@ index 000000000000..8b501a72c3dd
+
+ policy->driver_data = cpudata;
+
++ amd_pstate_boost_init(cpudata);
++
+ return 0;
+
+free_cpudata2:
@@ -800,541 +1267,10 @@ index 000000000000..8b501a72c3dd
+ return 0;
+}
+
-+static struct cpufreq_driver amd_pstate_driver = {
-+ .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
-+ .verify = amd_pstate_verify,
-+ .target = amd_pstate_target,
-+ .init = amd_pstate_cpu_init,
-+ .exit = amd_pstate_cpu_exit,
-+ .name = "amd-pstate",
-+};
-+
-+static int __init amd_pstate_init(void)
-+{
-+ int ret;
-+
-+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
-+ return -ENODEV;
-+
-+ if (!acpi_cpc_valid()) {
-+ pr_debug("the _CPC object is not present in SBIOS\n");
-+ return -ENODEV;
-+ }
-+
-+ /* don't keep reloading if cpufreq_driver exists */
-+ if (cpufreq_get_current_driver())
-+ return -EEXIST;
-+
-+ /* capability check */
-+ if (!boot_cpu_has(X86_FEATURE_CPPC)) {
-+ pr_debug("AMD CPPC MSR based functionality is not supported\n");
-+ return -ENODEV;
-+ }
-+
-+ /* enable amd pstate feature */
-+ ret = amd_pstate_enable(true);
-+ if (ret) {
-+ pr_err("failed to enable amd-pstate with return %d\n", ret);
-+ return ret;
-+ }
-+
-+ ret = cpufreq_register_driver(&amd_pstate_driver);
-+ if (ret)
-+ pr_err("failed to register amd_pstate_driver with return %d\n",
-+ ret);
-+
-+ return ret;
-+}
-+
-+static void __exit amd_pstate_exit(void)
-+{
-+ cpufreq_unregister_driver(&amd_pstate_driver);
-+
-+ amd_pstate_enable(false);
-+}
-+
-+module_init(amd_pstate_init);
-+module_exit(amd_pstate_exit);
-+
-+MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
-+MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
-+MODULE_LICENSE("GPL");
-
-
-
-Introduce the fast switch function for amd-pstate on the AMD processors
-which support the full MSR register control. It's able to decrease the
-latency on interrupt context.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/amd-pstate.c | 35 +++++++++++++++++++++++++++++++++++
- 1 file changed, 35 insertions(+)
-
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 8b501a72c3dd..4a02a42f4113 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -177,6 +177,38 @@ static int amd_pstate_target(struct cpufreq_policy *policy,
- return 0;
- }
-
-+static void amd_pstate_adjust_perf(unsigned int cpu,
-+ unsigned long _min_perf,
-+ unsigned long target_perf,
-+ unsigned long capacity)
-+{
-+ unsigned long max_perf, min_perf, des_perf,
-+ cap_perf, lowest_nonlinear_perf;
-+ struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
-+ struct amd_cpudata *cpudata = policy->driver_data;
-+
-+ cap_perf = READ_ONCE(cpudata->highest_perf);
-+ lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
-+
-+ if (target_perf < capacity)
-+ des_perf = DIV_ROUND_UP(cap_perf * target_perf, capacity);
-+
-+ min_perf = READ_ONCE(cpudata->highest_perf);
-+ if (_min_perf < capacity)
-+ min_perf = DIV_ROUND_UP(cap_perf * _min_perf, capacity);
-+
-+ if (min_perf < lowest_nonlinear_perf)
-+ min_perf = lowest_nonlinear_perf;
-+
-+ max_perf = cap_perf;
-+ if (max_perf < min_perf)
-+ max_perf = min_perf;
-+
-+ des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
-+
-+ amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true);
-+}
-+
- static int amd_get_min_freq(struct amd_cpudata *cpudata)
- {
- struct cppc_perf_caps cppc_perf;
-@@ -293,6 +325,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
- /* It will be updated by governor */
- policy->cur = policy->cpuinfo.min_freq;
-
-+ policy->fast_switch_possible = true;
-+
- ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
- FREQ_QOS_MIN, policy->cpuinfo.min_freq);
- if (ret < 0) {
-@@ -341,6 +375,7 @@ static struct cpufreq_driver amd_pstate_driver = {
- .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
- .verify = amd_pstate_verify,
- .target = amd_pstate_target,
-+ .adjust_perf = amd_pstate_adjust_perf,
- .init = amd_pstate_cpu_init,
- .exit = amd_pstate_cpu_exit,
- .name = "amd-pstate",
-
-
-
-In some of Zen2 and Zen3 based processors, they are using the shared
-memory that exposed from ACPI SBIOS. In this kind of the processors,
-there is no MSR support, so we add acpi cppc function as the backend for
-them.
-
-It is using a module param (shared_mem) to enable related processors
-manually. We will enable this by default once we address performance
-issue on this solution.
-
-Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/amd-pstate.c | 71 ++++++++++++++++++++++++++++++++++--
- 1 file changed, 67 insertions(+), 4 deletions(-)
-
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 4a02a42f4113..14a29326ceae 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -35,6 +35,19 @@
- #define AMD_PSTATE_TRANSITION_LATENCY 0x20000
- #define AMD_PSTATE_TRANSITION_DELAY 500
-
-+/* TODO: We need more time to fine tune processors with shared memory solution
-+ * with community together.
-+ *
-+ * There are some performance drops on the CPU benchmarks which reports from
-+ * Suse. We are co-working with them to fine tune the shared memory solution. So
-+ * we disable it by default to go acpi-cpufreq on these processors and add a
-+ * module parameter to be able to enable it manually for debugging.
-+ */
-+static bool shared_mem = false;
-+module_param(shared_mem, bool, 0444);
-+MODULE_PARM_DESC(shared_mem,
-+ "enable amd-pstate on processors with shared memory solution (false = disabled (default), true = enabled)");
-+
- static struct cpufreq_driver amd_pstate_driver;
-
- struct amd_cpudata {
-@@ -60,6 +73,19 @@ static inline int pstate_enable(bool enable)
- return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable ? 1 : 0);
- }
-
-+static int cppc_enable(bool enable)
-+{
-+ int cpu, ret = 0;
-+
-+ for_each_online_cpu(cpu) {
-+ ret = cppc_set_enable(cpu, enable ? 1 : 0);
-+ if (ret)
-+ return ret;
-+ }
-+
-+ return ret;
-+}
-+
- DEFINE_STATIC_CALL(amd_pstate_enable, pstate_enable);
-
- static inline int amd_pstate_enable(bool enable)
-@@ -90,6 +116,24 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
- return 0;
- }
-
-+static int cppc_init_perf(struct amd_cpudata *cpudata)
-+{
-+ struct cppc_perf_caps cppc_perf;
-+
-+ int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
-+ if (ret)
-+ return ret;
-+
-+ WRITE_ONCE(cpudata->highest_perf, amd_get_highest_perf());
-+
-+ WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
-+ WRITE_ONCE(cpudata->lowest_nonlinear_perf,
-+ cppc_perf.lowest_nonlinear_perf);
-+ WRITE_ONCE(cpudata->lowest_perf, cppc_perf.lowest_perf);
-+
-+ return 0;
-+}
-+
- DEFINE_STATIC_CALL(amd_pstate_init_perf, pstate_init_perf);
-
- static inline int amd_pstate_init_perf(struct amd_cpudata *cpudata)
-@@ -107,6 +151,19 @@ static void pstate_update_perf(struct amd_cpudata *cpudata, u32 min_perf,
- READ_ONCE(cpudata->cppc_req_cached));
- }
-
-+static void cppc_update_perf(struct amd_cpudata *cpudata,
-+ u32 min_perf, u32 des_perf,
-+ u32 max_perf, bool fast_switch)
-+{
-+ struct cppc_perf_ctrls perf_ctrls;
-+
-+ perf_ctrls.max_perf = max_perf;
-+ perf_ctrls.min_perf = min_perf;
-+ perf_ctrls.desired_perf = des_perf;
-+
-+ cppc_set_perf(cpudata->cpu, &perf_ctrls);
-+}
-+
- DEFINE_STATIC_CALL(amd_pstate_update_perf, pstate_update_perf);
-
- static inline void amd_pstate_update_perf(struct amd_cpudata *cpudata,
-@@ -325,7 +382,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
- /* It will be updated by governor */
- policy->cur = policy->cpuinfo.min_freq;
-
-- policy->fast_switch_possible = true;
-+ if (boot_cpu_has(X86_FEATURE_CPPC))
-+ policy->fast_switch_possible = true;
-
- ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
- FREQ_QOS_MIN, policy->cpuinfo.min_freq);
-@@ -375,7 +433,6 @@ static struct cpufreq_driver amd_pstate_driver = {
- .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
- .verify = amd_pstate_verify,
- .target = amd_pstate_target,
-- .adjust_perf = amd_pstate_adjust_perf,
- .init = amd_pstate_cpu_init,
- .exit = amd_pstate_cpu_exit,
- .name = "amd-pstate",
-@@ -398,8 +455,14 @@ static int __init amd_pstate_init(void)
- return -EEXIST;
-
- /* capability check */
-- if (!boot_cpu_has(X86_FEATURE_CPPC)) {
-- pr_debug("AMD CPPC MSR based functionality is not supported\n");
-+ if (boot_cpu_has(X86_FEATURE_CPPC)) {
-+ pr_debug("AMD CPPC MSR based functionality is supported\n");
-+ amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
-+ } else if (shared_mem) {
-+ static_call_update(amd_pstate_enable, cppc_enable);
-+ static_call_update(amd_pstate_init_perf, cppc_init_perf);
-+ static_call_update(amd_pstate_update_perf, cppc_update_perf);
-+ } else {
- return -ENODEV;
- }
-
-
-
-
-Add trace event to monitor the performance value changes which is
-controlled by cpu governors.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/Makefile | 6 ++-
- drivers/cpufreq/amd-pstate-trace.c | 2 +
- drivers/cpufreq/amd-pstate-trace.h | 77 ++++++++++++++++++++++++++++++
- drivers/cpufreq/amd-pstate.c | 4 ++
- 4 files changed, 88 insertions(+), 1 deletion(-)
- create mode 100644 drivers/cpufreq/amd-pstate-trace.c
- create mode 100644 drivers/cpufreq/amd-pstate-trace.h
-
-diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
-index c8d307010922..285de70af877 100644
---- a/drivers/cpufreq/Makefile
-+++ b/drivers/cpufreq/Makefile
-@@ -17,6 +17,10 @@ obj-$(CONFIG_CPU_FREQ_GOV_ATTR_SET) += cpufreq_governor_attr_set.o
- obj-$(CONFIG_CPUFREQ_DT) += cpufreq-dt.o
- obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o
-
-+# Traces
-+CFLAGS_amd-pstate-trace.o := -I$(src)
-+amd_pstate-y := amd-pstate.o amd-pstate-trace.o
-+
- ##################################################################################
- # x86 drivers.
- # Link order matters. K8 is preferred to ACPI because of firmware bugs in early
-@@ -25,7 +29,7 @@ obj-$(CONFIG_CPUFREQ_DT_PLATDEV) += cpufreq-dt-platdev.o
- # speedstep-* is preferred over p4-clockmod.
-
- obj-$(CONFIG_X86_ACPI_CPUFREQ) += acpi-cpufreq.o
--obj-$(CONFIG_X86_AMD_PSTATE) += amd-pstate.o
-+obj-$(CONFIG_X86_AMD_PSTATE) += amd_pstate.o
- obj-$(CONFIG_X86_POWERNOW_K8) += powernow-k8.o
- obj-$(CONFIG_X86_PCC_CPUFREQ) += pcc-cpufreq.o
- obj-$(CONFIG_X86_POWERNOW_K6) += powernow-k6.o
-diff --git a/drivers/cpufreq/amd-pstate-trace.c b/drivers/cpufreq/amd-pstate-trace.c
-new file mode 100644
-index 000000000000..891b696dcd69
---- /dev/null
-+++ b/drivers/cpufreq/amd-pstate-trace.c
-@@ -0,0 +1,2 @@
-+#define CREATE_TRACE_POINTS
-+#include "amd-pstate-trace.h"
-diff --git a/drivers/cpufreq/amd-pstate-trace.h b/drivers/cpufreq/amd-pstate-trace.h
-new file mode 100644
-index 000000000000..647505957d4f
---- /dev/null
-+++ b/drivers/cpufreq/amd-pstate-trace.h
-@@ -0,0 +1,77 @@
-+/* SPDX-License-Identifier: GPL-2.0 */
-+/*
-+ * amd-pstate-trace.h - AMD Processor P-state Frequency Driver Tracer
-+ *
-+ * Copyright (C) 2021 Advanced Micro Devices, Inc. All Rights Reserved.
-+ *
-+ * Author: Huang Rui <ray.huang@amd.com>
-+ */
-+
-+#if !defined(_AMD_PSTATE_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
-+#define _AMD_PSTATE_TRACE_H
-+
-+#include <linux/cpufreq.h>
-+#include <linux/tracepoint.h>
-+#include <linux/trace_events.h>
-+
-+#undef TRACE_SYSTEM
-+#define TRACE_SYSTEM amd_cpu
-+
-+#undef TRACE_INCLUDE_FILE
-+#define TRACE_INCLUDE_FILE amd-pstate-trace
-+
-+#define TPS(x) tracepoint_string(x)
-+
-+TRACE_EVENT(amd_pstate_perf,
-+
-+ TP_PROTO(unsigned long min_perf,
-+ unsigned long target_perf,
-+ unsigned long capacity,
-+ unsigned int cpu_id,
-+ bool changed,
-+ bool fast_switch
-+ ),
-+
-+ TP_ARGS(min_perf,
-+ target_perf,
-+ capacity,
-+ cpu_id,
-+ changed,
-+ fast_switch
-+ ),
-+
-+ TP_STRUCT__entry(
-+ __field(unsigned long, min_perf)
-+ __field(unsigned long, target_perf)
-+ __field(unsigned long, capacity)
-+ __field(unsigned int, cpu_id)
-+ __field(bool, changed)
-+ __field(bool, fast_switch)
-+ ),
-+
-+ TP_fast_assign(
-+ __entry->min_perf = min_perf;
-+ __entry->target_perf = target_perf;
-+ __entry->capacity = capacity;
-+ __entry->cpu_id = cpu_id;
-+ __entry->changed = changed;
-+ __entry->fast_switch = fast_switch;
-+ ),
-+
-+ TP_printk("amd_min_perf=%lu amd_des_perf=%lu amd_max_perf=%lu cpu_id=%u changed=%s fast_switch=%s",
-+ (unsigned long)__entry->min_perf,
-+ (unsigned long)__entry->target_perf,
-+ (unsigned long)__entry->capacity,
-+ (unsigned int)__entry->cpu_id,
-+ (__entry->changed) ? "true" : "false",
-+ (__entry->fast_switch) ? "true" : "false"
-+ )
-+);
-+
-+#endif /* _AMD_PSTATE_TRACE_H */
-+
-+/* This part must be outside protection */
-+#undef TRACE_INCLUDE_PATH
-+#define TRACE_INCLUDE_PATH .
-+
-+#include <trace/define_trace.h>
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 14a29326ceae..5e080d0dc45f 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -31,6 +31,7 @@
- #include <asm/processor.h>
- #include <asm/cpufeature.h>
- #include <asm/cpu_device_id.h>
-+#include "amd-pstate-trace.h"
-
- #define AMD_PSTATE_TRANSITION_LATENCY 0x20000
- #define AMD_PSTATE_TRANSITION_DELAY 500
-@@ -189,6 +190,9 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u32 min_perf,
- value &= ~REQ_MAX_PERF(~0L);
- value |= REQ_MAX_PERF(max_perf);
-
-+ trace_amd_pstate_perf(min_perf, des_perf, max_perf,
-+ cpudata->cpu, (value != prev), fast_switch);
-+
- if (value == prev)
- return;
-
-
-
-
-If the sbios supports the boost mode of amd-pstate, let's switch to
-boost enabled by default.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/amd-pstate.c | 44 ++++++++++++++++++++++++++++++++++++
- 1 file changed, 44 insertions(+)
-
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 5e080d0dc45f..0c335a917307 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -67,6 +67,8 @@ struct amd_cpudata {
- u32 min_freq;
- u32 nominal_freq;
- u32 lowest_nonlinear_freq;
-+
-+ bool boost_supported;
- };
-
- static inline int pstate_enable(bool enable)
-@@ -342,6 +344,45 @@ static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
- return lowest_nonlinear_freq * 1000;
- }
-
-+static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
-+{
-+ struct amd_cpudata *cpudata = policy->driver_data;
-+ int ret;
-+
-+ if (!cpudata->boost_supported) {
-+ pr_err("Boost mode is not supported by this processor or SBIOS\n");
-+ return -EINVAL;
-+ }
-+
-+ if (state)
-+ policy->cpuinfo.max_freq = cpudata->max_freq;
-+ else
-+ policy->cpuinfo.max_freq = cpudata->nominal_freq;
-+
-+ policy->max = policy->cpuinfo.max_freq;
-+
-+ ret = freq_qos_update_request(&cpudata->req[1],
-+ policy->cpuinfo.max_freq);
-+ if (ret < 0)
-+ return ret;
-+
-+ return 0;
-+}
-+
-+static void amd_pstate_boost_init(struct amd_cpudata *cpudata)
-+{
-+ u32 highest_perf, nominal_perf;
-+
-+ highest_perf = READ_ONCE(cpudata->highest_perf);
-+ nominal_perf = READ_ONCE(cpudata->nominal_perf);
-+
-+ if (highest_perf <= nominal_perf)
-+ return;
-+
-+ cpudata->boost_supported = true;
-+ amd_pstate_driver.boost_enabled = true;
-+}
-+
- static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
- {
- int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
-@@ -411,6 +452,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
-
- policy->driver_data = cpudata;
-
-+ amd_pstate_boost_init(cpudata);
-+
- return 0;
-
- free_cpudata2:
-@@ -439,6 +482,7 @@ static struct cpufreq_driver amd_pstate_driver = {
- .target = amd_pstate_target,
- .init = amd_pstate_cpu_init,
- .exit = amd_pstate_cpu_exit,
-+ .set_boost = amd_pstate_set_boost,
- .name = "amd-pstate",
- };
-
-
-
-
-Introduce sysfs attributes to get the different level processor
-frequencies.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/amd-pstate.c | 46 ++++++++++++++++++++++++++++++++++++
- 1 file changed, 46 insertions(+)
-
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 0c335a917307..09c5fd8bd9da 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -476,6 +476,51 @@ static int amd_pstate_cpu_exit(struct cpufreq_policy *policy)
- return 0;
- }
-
+/* Sysfs attributes */
+
-+/* This frequency is to indicate the maximum hardware frequency.
++/*
++ * This frequency is to indicate the maximum hardware frequency.
+ * If boost is not active but supported, the frequency will be larger than the
+ * one in cpuinfo.
+ */
@@ -1368,46 +1304,8 @@ index 0c335a917307..09c5fd8bd9da 100644
+ return sprintf(&buf[0], "%u\n", freq);
+}
+
-+cpufreq_freq_attr_ro(amd_pstate_max_freq);
-+cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
-+
-+static struct freq_attr *amd_pstate_attr[] = {
-+ &amd_pstate_max_freq,
-+ &amd_pstate_lowest_nonlinear_freq,
-+ NULL,
-+};
-+
- static struct cpufreq_driver amd_pstate_driver = {
- .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
- .verify = amd_pstate_verify,
-@@ -484,6 +529,7 @@ static struct cpufreq_driver amd_pstate_driver = {
- .exit = amd_pstate_cpu_exit,
- .set_boost = amd_pstate_set_boost,
- .name = "amd-pstate",
-+ .attr = amd_pstate_attr,
- };
-
- static int __init amd_pstate_init(void)
-
-
-
-Introduce sysfs attributes to get the different level amd-pstate
-performances.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- drivers/cpufreq/amd-pstate.c | 17 +++++++++++++++++
- 1 file changed, 17 insertions(+)
-
-diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
-index 09c5fd8bd9da..458313cdba93 100644
---- a/drivers/cpufreq/amd-pstate.c
-+++ b/drivers/cpufreq/amd-pstate.c
-@@ -512,12 +512,29 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
- return sprintf(&buf[0], "%u\n", freq);
- }
-
-+/* In some of ASICs, the highest_perf is not the one in the _CPC table, so we
++/*
++ * In some of ASICs, the highest_perf is not the one in the _CPC table, so we
+ * need to expose it to sysfs.
+ */
+static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
@@ -1421,270 +1319,109 @@ index 09c5fd8bd9da..458313cdba93 100644
+ return sprintf(&buf[0], "%u\n", perf);
+}
+
- cpufreq_freq_attr_ro(amd_pstate_max_freq);
- cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
-
++cpufreq_freq_attr_ro(amd_pstate_max_freq);
++cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
++
+cpufreq_freq_attr_ro(amd_pstate_highest_perf);
+
- static struct freq_attr *amd_pstate_attr[] = {
- &amd_pstate_max_freq,
- &amd_pstate_lowest_nonlinear_freq,
++static struct freq_attr *amd_pstate_attr[] = {
++ &amd_pstate_max_freq,
++ &amd_pstate_lowest_nonlinear_freq,
+ &amd_pstate_highest_perf,
- NULL,
- };
-
-
-
-
-Add AMD P-state capability flag in cpupower to indicate AMD new P-state
-kernel module support on Ryzen processors.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/helpers/helpers.h | 1 +
- 1 file changed, 1 insertion(+)
-
-diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
-index 33ffacee7fcb..b4813efdfb00 100644
---- a/tools/power/cpupower/utils/helpers/helpers.h
-+++ b/tools/power/cpupower/utils/helpers/helpers.h
-@@ -73,6 +73,7 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL,
- #define CPUPOWER_CAP_AMD_HW_PSTATE 0x00000100
- #define CPUPOWER_CAP_AMD_PSTATEDEF 0x00000200
- #define CPUPOWER_CAP_AMD_CPB_MSR 0x00000400
-+#define CPUPOWER_CAP_AMD_PSTATE 0x00000800
-
- #define CPUPOWER_AMD_CPBDIS 0x02000000
-
-
-
-
-The processor with amd-pstate function also supports legacy ACPI
-hardware P-States feature as well. Once driver sets amd-pstate eanbled,
-the processor will respond the finer grain amd-pstate feature instead of
-legacy ACPI P-States. So it introduces the cpupower_amd_pstate_enabled()
-to check whether the current kernel enables amd-pstate or acpi-cpufreq
-module.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/helpers/helpers.h | 10 ++++++++++
- tools/power/cpupower/utils/helpers/misc.c | 18 ++++++++++++++++++
- 2 files changed, 28 insertions(+)
-
-diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
-index b4813efdfb00..e03cc97297aa 100644
---- a/tools/power/cpupower/utils/helpers/helpers.h
-+++ b/tools/power/cpupower/utils/helpers/helpers.h
-@@ -11,6 +11,7 @@
-
- #include <libintl.h>
- #include <locale.h>
-+#include <stdbool.h>
-
- #include "helpers/bitmask.h"
- #include <cpupower.h>
-@@ -136,6 +137,12 @@ extern int decode_pstates(unsigned int cpu, int boost_states,
-
- extern int cpufreq_has_boost_support(unsigned int cpu, int *support,
- int *active, int * states);
++ NULL,
++};
+
-+/* AMD P-States stuff **************************/
-+extern bool cpupower_amd_pstate_enabled(void);
++static struct cpufreq_driver amd_pstate_driver = {
++ .flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
++ .verify = amd_pstate_verify,
++ .target = amd_pstate_target,
++ .init = amd_pstate_cpu_init,
++ .exit = amd_pstate_cpu_exit,
++ .set_boost = amd_pstate_set_boost,
++ .name = "amd-pstate",
++ .attr = amd_pstate_attr,
++};
+
-+/* AMD P-States stuff **************************/
++static int __init amd_pstate_init(void)
++{
++ int ret;
+
- /*
- * CPUID functions returning a single datum
- */
-@@ -168,6 +175,9 @@ static inline int cpufreq_has_boost_support(unsigned int cpu, int *support,
- int *active, int * states)
- { return -1; }
-
-+static inline bool cpupower_amd_pstate_enabled(void)
-+{ return false; }
++ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
++ return -ENODEV;
+
- /* cpuid and cpuinfo helpers **************************/
-
- static inline unsigned int cpuid_eax(unsigned int op) { return 0; };
-diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c
-index fc6e34511721..0c483cdefcc2 100644
---- a/tools/power/cpupower/utils/helpers/misc.c
-+++ b/tools/power/cpupower/utils/helpers/misc.c
-@@ -3,9 +3,11 @@
- #include <stdio.h>
- #include <errno.h>
- #include <stdlib.h>
-+#include <string.h>
-
- #include "helpers/helpers.h"
- #include "helpers/sysfs.h"
-+#include "cpufreq.h"
-
- #if defined(__i386__) || defined(__x86_64__)
-
-@@ -83,6 +85,22 @@ int cpupower_intel_set_perf_bias(unsigned int cpu, unsigned int val)
- return 0;
- }
-
-+bool cpupower_amd_pstate_enabled(void)
-+{
-+ char *driver = cpufreq_get_driver(0);
-+ bool ret = false;
++ if (!acpi_cpc_valid()) {
++ pr_debug("the _CPC object is not present in SBIOS\n");
++ return -ENODEV;
++ }
+
-+ if (!driver)
-+ return ret;
++ /* don't keep reloading if cpufreq_driver exists */
++ if (cpufreq_get_current_driver())
++ return -EEXIST;
+
-+ if (!strcmp(driver, "amd-pstate"))
-+ ret = true;
++ /* capability check */
++ if (boot_cpu_has(X86_FEATURE_CPPC)) {
++ pr_debug("AMD CPPC MSR based functionality is supported\n");
++ amd_pstate_driver.adjust_perf = amd_pstate_adjust_perf;
++ } else if (shared_mem) {
++ static_call_update(amd_pstate_enable, cppc_enable);
++ static_call_update(amd_pstate_init_perf, cppc_init_perf);
++ static_call_update(amd_pstate_update_perf, cppc_update_perf);
++ } else {
++ pr_info("This processor supports shared memory solution, you can enable it with amd_pstate.shared_mem=1\n");
++ return -ENODEV;
++ }
+
-+ cpufreq_put_driver(driver);
++ /* enable amd pstate feature */
++ ret = amd_pstate_enable(true);
++ if (ret) {
++ pr_err("failed to enable amd-pstate with return %d\n", ret);
++ return ret;
++ }
++
++ ret = cpufreq_register_driver(&amd_pstate_driver);
++ if (ret)
++ pr_err("failed to register amd_pstate_driver with return %d\n",
++ ret);
+
+ return ret;
+}
+
- #endif /* #if defined(__i386__) || defined(__x86_64__) */
-
- /* get_cpustate
-
-
-
-If kernel starts the amd-pstate module, the cpupower will initial the
-capability flag as CPUPOWER_CAP_AMD_PSTATE. And once amd-pstate
-capability is set, it won't need to set legacy ACPI relative
-capabilities anymore.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/helpers/cpuid.c | 13 +++++++++++++
- 1 file changed, 13 insertions(+)
-
-diff --git a/tools/power/cpupower/utils/helpers/cpuid.c b/tools/power/cpupower/utils/helpers/cpuid.c
-index 72eb43593180..2a6dc104e76b 100644
---- a/tools/power/cpupower/utils/helpers/cpuid.c
-+++ b/tools/power/cpupower/utils/helpers/cpuid.c
-@@ -149,6 +149,19 @@ int get_cpu_info(struct cpupower_cpu_info *cpu_info)
- if (ext_cpuid_level >= 0x80000008 &&
- cpuid_ebx(0x80000008) & (1 << 4))
- cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU;
++static void __exit amd_pstate_exit(void)
++{
++ cpufreq_unregister_driver(&amd_pstate_driver);
+
-+ if (cpupower_amd_pstate_enabled()) {
-+ cpu_info->caps |= CPUPOWER_CAP_AMD_PSTATE;
++ amd_pstate_enable(false);
++}
+
-+ /*
-+ * If AMD P-state is enabled, the firmware will treat
-+ * AMD P-state function as high priority.
-+ */
-+ cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB;
-+ cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB_MSR;
-+ cpu_info->caps &= ~CPUPOWER_CAP_AMD_HW_PSTATE;
-+ cpu_info->caps &= ~CPUPOWER_CAP_AMD_PSTATEDEF;
-+ }
- }
-
- if (cpu_info->vendor == X86_VENDOR_INTEL) {
-
-
-
-Expose the helper into cpufreq header, then cpufreq driver can use this
-function to get the sysfs value if it has any specific sysfs interfaces.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/lib/cpufreq.c | 21 +++++++++++++++------
- tools/power/cpupower/lib/cpufreq.h | 12 ++++++++++++
- 2 files changed, 27 insertions(+), 6 deletions(-)
-
-diff --git a/tools/power/cpupower/lib/cpufreq.c b/tools/power/cpupower/lib/cpufreq.c
-index c3b56db8b921..02719cc400a1 100644
---- a/tools/power/cpupower/lib/cpufreq.c
-+++ b/tools/power/cpupower/lib/cpufreq.c
-@@ -83,20 +83,21 @@ static const char *cpufreq_value_files[MAX_CPUFREQ_VALUE_READ_FILES] = {
- [STATS_NUM_TRANSITIONS] = "stats/total_trans"
- };
-
--
--static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
-- enum cpufreq_value which)
-+unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu,
-+ const char **table,
-+ unsigned index,
-+ unsigned size)
++module_init(amd_pstate_init);
++module_exit(amd_pstate_exit);
++
++MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
++MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
++MODULE_LICENSE("GPL");
+diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
+index bc159a9b4a73..92b7ea8d8f5e 100644
+--- a/include/acpi/cppc_acpi.h
++++ b/include/acpi/cppc_acpi.h
+@@ -138,6 +138,7 @@ extern int cppc_get_desired_perf(int cpunum, u64 *desired_perf);
+ extern int cppc_get_nominal_perf(int cpunum, u64 *nominal_perf);
+ extern int cppc_get_perf_ctrs(int cpu, struct cppc_perf_fb_ctrs *perf_fb_ctrs);
+ extern int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls);
++extern int cppc_set_enable(int cpu, bool enable);
+ extern int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps);
+ extern bool acpi_cpc_valid(void);
+ extern int acpi_get_psd_map(unsigned int cpu, struct cppc_cpudata *cpu_data);
+@@ -162,6 +163,10 @@ static inline int cppc_set_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls)
{
- unsigned long value;
- unsigned int len;
- char linebuf[MAX_LINE_LEN];
- char *endp;
-
-- if (which >= MAX_CPUFREQ_VALUE_READ_FILES)
-+ if (!table && !table[index] && index >= size)
- return 0;
-
-- len = sysfs_cpufreq_read_file(cpu, cpufreq_value_files[which],
-- linebuf, sizeof(linebuf));
-+ len = sysfs_cpufreq_read_file(cpu, table[index], linebuf,
-+ sizeof(linebuf));
-
- if (len == 0)
- return 0;
-@@ -109,6 +110,14 @@ static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
- return value;
+ return -ENOTSUPP;
}
-
-+static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
-+ enum cpufreq_value which)
++static inline int cppc_set_enable(int cpu, bool enable)
+{
-+ return cpufreq_get_sysfs_value_from_table(cpu, cpufreq_value_files,
-+ which,
-+ MAX_CPUFREQ_VALUE_READ_FILES);
++ return -ENOTSUPP;
+}
-+
- /* read access to files which contain one string */
-
- enum cpufreq_string {
-diff --git a/tools/power/cpupower/lib/cpufreq.h b/tools/power/cpupower/lib/cpufreq.h
-index 95f4fd9e2656..107668c0c454 100644
---- a/tools/power/cpupower/lib/cpufreq.h
-+++ b/tools/power/cpupower/lib/cpufreq.h
-@@ -203,6 +203,18 @@ int cpufreq_modify_policy_governor(unsigned int cpu, char *governor);
- int cpufreq_set_frequency(unsigned int cpu,
- unsigned long target_frequency);
-
-+/*
-+ * get the sysfs value from specific table
-+ *
-+ * Read the value with the sysfs file name from specific table. Does
-+ * only work if the cpufreq driver has the specific sysfs interfaces.
-+ */
-+
-+unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu,
-+ const char **table,
-+ unsigned index,
-+ unsigned size);
-+
- #ifdef __cplusplus
- }
- #endif
-
-
-
-Kernel ACPI subsytem introduced the sysfs attributes for acpi cppc
-library in below path:
-
-/sys/devices/system/cpu/cpuX/acpi_cppc/
-
-And these attributes will be used for amd-pstate driver to provide some
-performance and frequency values.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/Makefile | 6 +--
- tools/power/cpupower/lib/acpi_cppc.c | 59 ++++++++++++++++++++++++++++
- tools/power/cpupower/lib/acpi_cppc.h | 21 ++++++++++
- 3 files changed, 83 insertions(+), 3 deletions(-)
- create mode 100644 tools/power/cpupower/lib/acpi_cppc.c
- create mode 100644 tools/power/cpupower/lib/acpi_cppc.h
-
+ static inline int cppc_get_perf_caps(int cpu, struct cppc_perf_caps *caps)
+ {
+ return -ENOTSUPP;
diff --git a/tools/power/cpupower/Makefile b/tools/power/cpupower/Makefile
index 3b1594447f29..e9b6de314654 100644
--- a/tools/power/cpupower/Makefile
@@ -1794,172 +1531,78 @@ index 000000000000..576291155224
+ enum acpi_cppc_value which);
+
+#endif /* _ACPI_CPPC_H */
-
-
-
-Introduce the marco definitions and access helper function for
-amd-pstate sysfs interfaces such as each performance goals and frequency
-levels in amd helper file. They will be used to read the sysfs attribute
-from amd-pstate cpufreq driver for cpupower utilities.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/helpers/amd.c | 30 ++++++++++++++++++++++++
- 1 file changed, 30 insertions(+)
-
-diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
-index 97f2c857048e..14c658daba4b 100644
---- a/tools/power/cpupower/utils/helpers/amd.c
-+++ b/tools/power/cpupower/utils/helpers/amd.c
-@@ -8,7 +8,10 @@
- #include <pci/pci.h>
+diff --git a/tools/power/cpupower/lib/cpufreq.c b/tools/power/cpupower/lib/cpufreq.c
+index c3b56db8b921..c011bca27041 100644
+--- a/tools/power/cpupower/lib/cpufreq.c
++++ b/tools/power/cpupower/lib/cpufreq.c
+@@ -83,20 +83,21 @@ static const char *cpufreq_value_files[MAX_CPUFREQ_VALUE_READ_FILES] = {
+ [STATS_NUM_TRANSITIONS] = "stats/total_trans"
+ };
- #include "helpers/helpers.h"
-+#include "cpufreq.h"
-+#include "acpi_cppc.h"
+-
+-static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
+- enum cpufreq_value which)
++unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu,
++ const char **table,
++ unsigned index,
++ unsigned size)
+ {
+ unsigned long value;
+ unsigned int len;
+ char linebuf[MAX_LINE_LEN];
+ char *endp;
-+/* ACPI P-States Helper Functions for AMD Processors ***************/
- #define MSR_AMD_PSTATE_STATUS 0xc0010063
- #define MSR_AMD_PSTATE 0xc0010064
- #define MSR_AMD_PSTATE_LIMIT 0xc0010061
-@@ -146,4 +149,31 @@ int amd_pci_get_num_boost_states(int *active, int *states)
- pci_cleanup(pci_acc);
- return 0;
- }
-+
-+/* ACPI P-States Helper Functions for AMD Processors ***************/
-+
-+/* AMD P-States Helper Functions ***************/
-+enum amd_pstate_value {
-+ AMD_PSTATE_HIGHEST_PERF,
-+ AMD_PSTATE_MAX_FREQ,
-+ AMD_PSTATE_LOWEST_NONLINEAR_FREQ,
-+ MAX_AMD_PSTATE_VALUE_READ_FILES,
-+};
-+
-+static const char *amd_pstate_value_files[MAX_AMD_PSTATE_VALUE_READ_FILES] = {
-+ [AMD_PSTATE_HIGHEST_PERF] = "amd_pstate_highest_perf",
-+ [AMD_PSTATE_MAX_FREQ] = "amd_pstate_max_freq",
-+ [AMD_PSTATE_LOWEST_NONLINEAR_FREQ] = "amd_pstate_lowest_nonlinear_freq",
-+};
-+
-+static unsigned long amd_pstate_get_data(unsigned int cpu,
-+ enum amd_pstate_value value)
-+{
-+ return cpufreq_get_sysfs_value_from_table(cpu,
-+ amd_pstate_value_files,
-+ value,
-+ MAX_AMD_PSTATE_VALUE_READ_FILES);
-+}
-+
-+/* AMD P-States Helper Functions ***************/
- #endif /* defined(__i386__) || defined(__x86_64__) */
-
-
-
-The legacy ACPI hardware P-States function has 3 P-States on ACPI table,
-the CPU frequency only can be switched between the 3 P-States. While the
-processor supports the boost state, it will have another boost state
-that the frequency can be higher than P0 state, and the state can be
-decoded by the function of decode_pstates() and read by
-amd_pci_get_num_boost_states().
-
-However, the new AMD P-States function is different than legacy ACPI
-hardware P-State on AMD processors. That has a finer grain frequency
-range between the highest and lowest frequency. And boost frequency is
-actually the frequency which is mapped on highest performance ratio. The
-similiar previous P0 frequency is mapped on nominal performance ratio.
-If the highest performance on the processor is higher than nominal
-performance, then we think the current processor supports the boost
-state. And it uses amd_pstate_boost_init() to initialize boost for AMD
-P-States function.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/helpers/amd.c | 18 ++++++++++++++++++
- tools/power/cpupower/utils/helpers/helpers.h | 5 +++++
- tools/power/cpupower/utils/helpers/misc.c | 2 ++
- 3 files changed, 25 insertions(+)
-
-diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
-index 14c658daba4b..bde6065cabf4 100644
---- a/tools/power/cpupower/utils/helpers/amd.c
-+++ b/tools/power/cpupower/utils/helpers/amd.c
-@@ -175,5 +175,23 @@ static unsigned long amd_pstate_get_data(unsigned int cpu,
- MAX_AMD_PSTATE_VALUE_READ_FILES);
+- if (which >= MAX_CPUFREQ_VALUE_READ_FILES)
++ if (!table || index >= size || !table[index])
+ return 0;
+
+- len = sysfs_cpufreq_read_file(cpu, cpufreq_value_files[which],
+- linebuf, sizeof(linebuf));
++ len = sysfs_cpufreq_read_file(cpu, table[index], linebuf,
++ sizeof(linebuf));
+
+ if (len == 0)
+ return 0;
+@@ -109,6 +110,14 @@ static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
+ return value;
}
-+void amd_pstate_boost_init(unsigned int cpu, int *support, int *active)
++static unsigned long sysfs_cpufreq_get_one_value(unsigned int cpu,
++ enum cpufreq_value which)
+{
-+ unsigned long highest_perf, nominal_perf, cpuinfo_min,
-+ cpuinfo_max, amd_pstate_max;
-+
-+ highest_perf = amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF);
-+ nominal_perf = acpi_cppc_get_data(cpu, NOMINAL_PERF);
-+
-+ *support = highest_perf > nominal_perf ? 1 : 0;
-+ if (!(*support))
-+ return;
-+
-+ cpufreq_get_hardware_limits(cpu, &cpuinfo_min, &cpuinfo_max);
-+ amd_pstate_max = amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ);
-+
-+ *active = cpuinfo_max == amd_pstate_max ? 1 : 0;
++ return cpufreq_get_sysfs_value_from_table(cpu, cpufreq_value_files,
++ which,
++ MAX_CPUFREQ_VALUE_READ_FILES);
+}
+
- /* AMD P-States Helper Functions ***************/
- #endif /* defined(__i386__) || defined(__x86_64__) */
-diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
-index e03cc97297aa..c03925bea655 100644
---- a/tools/power/cpupower/utils/helpers/helpers.h
-+++ b/tools/power/cpupower/utils/helpers/helpers.h
-@@ -140,6 +140,8 @@ extern int cpufreq_has_boost_support(unsigned int cpu, int *support,
-
- /* AMD P-States stuff **************************/
- extern bool cpupower_amd_pstate_enabled(void);
-+extern void amd_pstate_boost_init(unsigned int cpu,
-+ int *support, int *active);
-
- /* AMD P-States stuff **************************/
-
-@@ -177,6 +179,9 @@ static inline int cpufreq_has_boost_support(unsigned int cpu, int *support,
-
- static inline bool cpupower_amd_pstate_enabled(void)
- { return false; }
-+static void amd_pstate_boost_init(unsigned int cpu,
-+ int *support, int *active)
-+{ return; }
+ /* read access to files which contain one string */
- /* cpuid and cpuinfo helpers **************************/
+ enum cpufreq_string {
+diff --git a/tools/power/cpupower/lib/cpufreq.h b/tools/power/cpupower/lib/cpufreq.h
+index 95f4fd9e2656..107668c0c454 100644
+--- a/tools/power/cpupower/lib/cpufreq.h
++++ b/tools/power/cpupower/lib/cpufreq.h
+@@ -203,6 +203,18 @@ int cpufreq_modify_policy_governor(unsigned int cpu, char *governor);
+ int cpufreq_set_frequency(unsigned int cpu,
+ unsigned long target_frequency);
-diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c
-index 0c483cdefcc2..e0d3145434d3 100644
---- a/tools/power/cpupower/utils/helpers/misc.c
-+++ b/tools/power/cpupower/utils/helpers/misc.c
-@@ -41,6 +41,8 @@ int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active,
- if (ret)
- return ret;
- }
-+ } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) {
-+ amd_pstate_boost_init(cpu, support, active);
- } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_INTEL_IDA)
- *support = *active = 1;
- return 0;
-
-
-
-The print_speed can be as a common function, and expose it into misc
-helper header. Then it can be used on other helper files as well.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/cpufreq-info.c | 59 ++++----------------
- tools/power/cpupower/utils/helpers/helpers.h | 1 +
- tools/power/cpupower/utils/helpers/misc.c | 42 ++++++++++++++
- 3 files changed, 54 insertions(+), 48 deletions(-)
-
++/*
++ * get the sysfs value from specific table
++ *
++ * Read the value with the sysfs file name from specific table. Does
++ * only work if the cpufreq driver has the specific sysfs interfaces.
++ */
++
++unsigned long cpufreq_get_sysfs_value_from_table(unsigned int cpu,
++ const char **table,
++ unsigned index,
++ unsigned size);
++
+ #ifdef __cplusplus
+ }
+ #endif
diff --git a/tools/power/cpupower/utils/cpufreq-info.c b/tools/power/cpupower/utils/cpufreq-info.c
-index f9895e31ff5a..b429454bf3ae 100644
+index f9895e31ff5a..f828f3c35a6f 100644
--- a/tools/power/cpupower/utils/cpufreq-info.c
+++ b/tools/power/cpupower/utils/cpufreq-info.c
@@ -84,43 +84,6 @@ static void proc_cpufreq_output(void)
@@ -2006,7 +1649,23 @@ index f9895e31ff5a..b429454bf3ae 100644
static void print_duration(unsigned long duration)
{
unsigned long tmp;
-@@ -254,11 +217,11 @@ static int get_boost_mode(unsigned int cpu)
+@@ -183,9 +146,12 @@ static int get_boost_mode_x86(unsigned int cpu)
+ printf(_(" Supported: %s\n"), support ? _("yes") : _("no"));
+ printf(_(" Active: %s\n"), active ? _("yes") : _("no"));
+
+- if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
+- cpupower_cpu_info.family >= 0x10) ||
+- cpupower_cpu_info.vendor == X86_VENDOR_HYGON) {
++ if (cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
++ cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) {
++ amd_pstate_show_perf_and_freq(cpu, no_rounding);
++ } else if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
++ cpupower_cpu_info.family >= 0x10) ||
++ cpupower_cpu_info.vendor == X86_VENDOR_HYGON) {
+ ret = decode_pstates(cpu, b_states, pstates, &pstate_no);
+ if (ret)
+ return ret;
+@@ -254,11 +220,11 @@ static int get_boost_mode(unsigned int cpu)
if (freqs) {
printf(_(" boost frequency steps: "));
while (freqs->next) {
@@ -2020,7 +1679,7 @@ index f9895e31ff5a..b429454bf3ae 100644
printf("\n");
cpufreq_put_available_frequencies(freqs);
}
-@@ -277,7 +240,7 @@ static int get_freq_kernel(unsigned int cpu, unsigned int human)
+@@ -277,7 +243,7 @@ static int get_freq_kernel(unsigned int cpu, unsigned int human)
return -EINVAL;
}
if (human) {
@@ -2029,7 +1688,7 @@ index f9895e31ff5a..b429454bf3ae 100644
} else
printf("%lu", freq);
printf(_(" (asserted by call to kernel)\n"));
-@@ -296,7 +259,7 @@ static int get_freq_hardware(unsigned int cpu, unsigned int human)
+@@ -296,7 +262,7 @@ static int get_freq_hardware(unsigned int cpu, unsigned int human)
return -EINVAL;
}
if (human) {
@@ -2038,7 +1697,7 @@ index f9895e31ff5a..b429454bf3ae 100644
} else
printf("%lu", freq);
printf(_(" (asserted by call to hardware)\n"));
-@@ -316,9 +279,9 @@ static int get_hardware_limits(unsigned int cpu, unsigned int human)
+@@ -316,9 +282,9 @@ static int get_hardware_limits(unsigned int cpu, unsigned int human)
if (human) {
printf(_(" hardware limits: "));
@@ -2050,7 +1709,7 @@ index f9895e31ff5a..b429454bf3ae 100644
printf("\n");
} else {
printf("%lu %lu\n", min, max);
-@@ -350,9 +313,9 @@ static int get_policy(unsigned int cpu)
+@@ -350,9 +316,9 @@ static int get_policy(unsigned int cpu)
return -EINVAL;
}
printf(_(" current policy: frequency should be within "));
@@ -2062,7 +1721,7 @@ index f9895e31ff5a..b429454bf3ae 100644
printf(".\n ");
printf(_("The governor \"%s\" may decide which speed to use\n"
-@@ -436,7 +399,7 @@ static int get_freq_stats(unsigned int cpu, unsigned int human)
+@@ -436,7 +402,7 @@ static int get_freq_stats(unsigned int cpu, unsigned int human)
struct cpufreq_stats *stats = cpufreq_get_stats(cpu, &total_time);
while (stats) {
if (human) {
@@ -2071,7 +1730,7 @@ index f9895e31ff5a..b429454bf3ae 100644
printf(":%.2f%%",
(100.0 * stats->time_in_state) / total_time);
} else
-@@ -486,11 +449,11 @@ static void debug_output_one(unsigned int cpu)
+@@ -486,11 +452,11 @@ static void debug_output_one(unsigned int cpu)
if (freqs) {
printf(_(" available frequency steps: "));
while (freqs->next) {
@@ -2085,11 +1744,169 @@ index f9895e31ff5a..b429454bf3ae 100644
printf("\n");
cpufreq_put_available_frequencies(freqs);
}
+diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
+index 97f2c857048e..a1115891d76d 100644
+--- a/tools/power/cpupower/utils/helpers/amd.c
++++ b/tools/power/cpupower/utils/helpers/amd.c
+@@ -8,7 +8,10 @@
+ #include <pci/pci.h>
+
+ #include "helpers/helpers.h"
++#include "cpufreq.h"
++#include "acpi_cppc.h"
+
++/* ACPI P-States Helper Functions for AMD Processors ***************/
+ #define MSR_AMD_PSTATE_STATUS 0xc0010063
+ #define MSR_AMD_PSTATE 0xc0010064
+ #define MSR_AMD_PSTATE_LIMIT 0xc0010061
+@@ -146,4 +149,77 @@ int amd_pci_get_num_boost_states(int *active, int *states)
+ pci_cleanup(pci_acc);
+ return 0;
+ }
++
++/* ACPI P-States Helper Functions for AMD Processors ***************/
++
++/* AMD P-States Helper Functions ***************/
++enum amd_pstate_value {
++ AMD_PSTATE_HIGHEST_PERF,
++ AMD_PSTATE_MAX_FREQ,
++ AMD_PSTATE_LOWEST_NONLINEAR_FREQ,
++ MAX_AMD_PSTATE_VALUE_READ_FILES,
++};
++
++static const char *amd_pstate_value_files[MAX_AMD_PSTATE_VALUE_READ_FILES] = {
++ [AMD_PSTATE_HIGHEST_PERF] = "amd_pstate_highest_perf",
++ [AMD_PSTATE_MAX_FREQ] = "amd_pstate_max_freq",
++ [AMD_PSTATE_LOWEST_NONLINEAR_FREQ] = "amd_pstate_lowest_nonlinear_freq",
++};
++
++static unsigned long amd_pstate_get_data(unsigned int cpu,
++ enum amd_pstate_value value)
++{
++ return cpufreq_get_sysfs_value_from_table(cpu,
++ amd_pstate_value_files,
++ value,
++ MAX_AMD_PSTATE_VALUE_READ_FILES);
++}
++
++void amd_pstate_boost_init(unsigned int cpu, int *support, int *active)
++{
++ unsigned long highest_perf, nominal_perf, cpuinfo_min,
++ cpuinfo_max, amd_pstate_max;
++
++ highest_perf = amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF);
++ nominal_perf = acpi_cppc_get_data(cpu, NOMINAL_PERF);
++
++ *support = highest_perf > nominal_perf ? 1 : 0;
++ if (!(*support))
++ return;
++
++ cpufreq_get_hardware_limits(cpu, &cpuinfo_min, &cpuinfo_max);
++ amd_pstate_max = amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ);
++
++ *active = cpuinfo_max == amd_pstate_max ? 1 : 0;
++}
++
++void amd_pstate_show_perf_and_freq(unsigned int cpu, int no_rounding)
++{
++ printf(_(" AMD PSTATE Highest Performance: %lu. Maximum Frequency: "),
++ amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF));
++ /* If boost isn't active, the cpuinfo_max doesn't indicate real max
++ * frequency. So we read it back from amd-pstate sysfs entry.
++ */
++ print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ), no_rounding);
++ printf(".\n");
++
++ printf(_(" AMD PSTATE Nominal Performance: %lu. Nominal Frequency: "),
++ acpi_cppc_get_data(cpu, NOMINAL_PERF));
++ print_speed(acpi_cppc_get_data(cpu, NOMINAL_FREQ) * 1000,
++ no_rounding);
++ printf(".\n");
++
++ printf(_(" AMD PSTATE Lowest Non-linear Performance: %lu. Lowest Non-linear Frequency: "),
++ acpi_cppc_get_data(cpu, LOWEST_NONLINEAR_PERF));
++ print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_LOWEST_NONLINEAR_FREQ),
++ no_rounding);
++ printf(".\n");
++
++ printf(_(" AMD PSTATE Lowest Performance: %lu. Lowest Frequency: "),
++ acpi_cppc_get_data(cpu, LOWEST_PERF));
++ print_speed(acpi_cppc_get_data(cpu, LOWEST_FREQ) * 1000, no_rounding);
++ printf(".\n");
++}
++
++/* AMD P-States Helper Functions ***************/
+ #endif /* defined(__i386__) || defined(__x86_64__) */
+diff --git a/tools/power/cpupower/utils/helpers/cpuid.c b/tools/power/cpupower/utils/helpers/cpuid.c
+index 72eb43593180..2a6dc104e76b 100644
+--- a/tools/power/cpupower/utils/helpers/cpuid.c
++++ b/tools/power/cpupower/utils/helpers/cpuid.c
+@@ -149,6 +149,19 @@ int get_cpu_info(struct cpupower_cpu_info *cpu_info)
+ if (ext_cpuid_level >= 0x80000008 &&
+ cpuid_ebx(0x80000008) & (1 << 4))
+ cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU;
++
++ if (cpupower_amd_pstate_enabled()) {
++ cpu_info->caps |= CPUPOWER_CAP_AMD_PSTATE;
++
++ /*
++ * If AMD P-state is enabled, the firmware will treat
++ * AMD P-state function as high priority.
++ */
++ cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB;
++ cpu_info->caps &= ~CPUPOWER_CAP_AMD_CPB_MSR;
++ cpu_info->caps &= ~CPUPOWER_CAP_AMD_HW_PSTATE;
++ cpu_info->caps &= ~CPUPOWER_CAP_AMD_PSTATEDEF;
++ }
+ }
+
+ if (cpu_info->vendor == X86_VENDOR_INTEL) {
diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
-index c03925bea655..fbbfa6047c83 100644
+index b4813efdfb00..5f6862502dbf 100644
--- a/tools/power/cpupower/utils/helpers/helpers.h
+++ b/tools/power/cpupower/utils/helpers/helpers.h
-@@ -200,5 +200,6 @@ extern struct bitmask *offline_cpus;
+@@ -11,6 +11,7 @@
+
+ #include <libintl.h>
+ #include <locale.h>
++#include <stdbool.h>
+
+ #include "helpers/bitmask.h"
+ #include <cpupower.h>
+@@ -136,6 +137,16 @@ extern int decode_pstates(unsigned int cpu, int boost_states,
+
+ extern int cpufreq_has_boost_support(unsigned int cpu, int *support,
+ int *active, int * states);
++
++/* AMD P-States stuff **************************/
++extern bool cpupower_amd_pstate_enabled(void);
++extern void amd_pstate_boost_init(unsigned int cpu,
++ int *support, int *active);
++extern void amd_pstate_show_perf_and_freq(unsigned int cpu,
++ int no_rounding);
++
++/* AMD P-States stuff **************************/
++
+ /*
+ * CPUID functions returning a single datum
+ */
+@@ -168,6 +179,15 @@ static inline int cpufreq_has_boost_support(unsigned int cpu, int *support,
+ int *active, int * states)
+ { return -1; }
+
++static inline bool cpupower_amd_pstate_enabled(void)
++{ return false; }
++static void amd_pstate_boost_init(unsigned int cpu,
++ int *support, int *active)
++{ return; }
++static inline void amd_pstate_show_perf_and_freq(unsigned int cpu,
++ int no_rounding)
++{ return; }
++
+ /* cpuid and cpuinfo helpers **************************/
+
+ static inline unsigned int cpuid_eax(unsigned int op) { return 0; };
+@@ -185,5 +205,6 @@ extern struct bitmask *offline_cpus;
void get_cpustate(void);
void print_online_cpus(void);
void print_offline_cpus(void);
@@ -2097,10 +1914,54 @@ index c03925bea655..fbbfa6047c83 100644
#endif /* __CPUPOWERUTILS_HELPERS__ */
diff --git a/tools/power/cpupower/utils/helpers/misc.c b/tools/power/cpupower/utils/helpers/misc.c
-index e0d3145434d3..d693c96cd09c 100644
+index fc6e34511721..d693c96cd09c 100644
--- a/tools/power/cpupower/utils/helpers/misc.c
+++ b/tools/power/cpupower/utils/helpers/misc.c
-@@ -164,3 +164,45 @@ void print_offline_cpus(void)
+@@ -3,9 +3,11 @@
+ #include <stdio.h>
+ #include <errno.h>
+ #include <stdlib.h>
++#include <string.h>
+
+ #include "helpers/helpers.h"
+ #include "helpers/sysfs.h"
++#include "cpufreq.h"
+
+ #if defined(__i386__) || defined(__x86_64__)
+
+@@ -39,6 +41,8 @@ int cpufreq_has_boost_support(unsigned int cpu, int *support, int *active,
+ if (ret)
+ return ret;
+ }
++ } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) {
++ amd_pstate_boost_init(cpu, support, active);
+ } else if (cpupower_cpu_info.caps & CPUPOWER_CAP_INTEL_IDA)
+ *support = *active = 1;
+ return 0;
+@@ -83,6 +87,22 @@ int cpupower_intel_set_perf_bias(unsigned int cpu, unsigned int val)
+ return 0;
+ }
+
++bool cpupower_amd_pstate_enabled(void)
++{
++ char *driver = cpufreq_get_driver(0);
++ bool ret = false;
++
++ if (!driver)
++ return ret;
++
++ if (!strcmp(driver, "amd-pstate"))
++ ret = true;
++
++ cpufreq_put_driver(driver);
++
++ return ret;
++}
++
+ #endif /* #if defined(__i386__) || defined(__x86_64__) */
+
+ /* get_cpustate
+@@ -144,3 +164,45 @@ void print_offline_cpus(void)
printf(_("cpupower set operation was not performed on them\n"));
}
}
@@ -2146,501 +2007,5 @@ index e0d3145434d3..d693c96cd09c 100644
+
+ return;
+}
-
-
-
-amd-pstate kernel module is using the fine grain frequency instead of
-acpi hardware pstate. So the performance and frequency values should be
-printed in frequency-info.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- tools/power/cpupower/utils/cpufreq-info.c | 9 ++++---
- tools/power/cpupower/utils/helpers/amd.c | 28 ++++++++++++++++++++
- tools/power/cpupower/utils/helpers/helpers.h | 5 ++++
- 3 files changed, 39 insertions(+), 3 deletions(-)
-
-diff --git a/tools/power/cpupower/utils/cpufreq-info.c b/tools/power/cpupower/utils/cpufreq-info.c
-index b429454bf3ae..f828f3c35a6f 100644
---- a/tools/power/cpupower/utils/cpufreq-info.c
-+++ b/tools/power/cpupower/utils/cpufreq-info.c
-@@ -146,9 +146,12 @@ static int get_boost_mode_x86(unsigned int cpu)
- printf(_(" Supported: %s\n"), support ? _("yes") : _("no"));
- printf(_(" Active: %s\n"), active ? _("yes") : _("no"));
-
-- if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
-- cpupower_cpu_info.family >= 0x10) ||
-- cpupower_cpu_info.vendor == X86_VENDOR_HYGON) {
-+ if (cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
-+ cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_PSTATE) {
-+ amd_pstate_show_perf_and_freq(cpu, no_rounding);
-+ } else if ((cpupower_cpu_info.vendor == X86_VENDOR_AMD &&
-+ cpupower_cpu_info.family >= 0x10) ||
-+ cpupower_cpu_info.vendor == X86_VENDOR_HYGON) {
- ret = decode_pstates(cpu, b_states, pstates, &pstate_no);
- if (ret)
- return ret;
-diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
-index bde6065cabf4..a1115891d76d 100644
---- a/tools/power/cpupower/utils/helpers/amd.c
-+++ b/tools/power/cpupower/utils/helpers/amd.c
-@@ -193,5 +193,33 @@ void amd_pstate_boost_init(unsigned int cpu, int *support, int *active)
- *active = cpuinfo_max == amd_pstate_max ? 1 : 0;
- }
-
-+void amd_pstate_show_perf_and_freq(unsigned int cpu, int no_rounding)
-+{
-+ printf(_(" AMD PSTATE Highest Performance: %lu. Maximum Frequency: "),
-+ amd_pstate_get_data(cpu, AMD_PSTATE_HIGHEST_PERF));
-+ /* If boost isn't active, the cpuinfo_max doesn't indicate real max
-+ * frequency. So we read it back from amd-pstate sysfs entry.
-+ */
-+ print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_MAX_FREQ), no_rounding);
-+ printf(".\n");
-+
-+ printf(_(" AMD PSTATE Nominal Performance: %lu. Nominal Frequency: "),
-+ acpi_cppc_get_data(cpu, NOMINAL_PERF));
-+ print_speed(acpi_cppc_get_data(cpu, NOMINAL_FREQ) * 1000,
-+ no_rounding);
-+ printf(".\n");
-+
-+ printf(_(" AMD PSTATE Lowest Non-linear Performance: %lu. Lowest Non-linear Frequency: "),
-+ acpi_cppc_get_data(cpu, LOWEST_NONLINEAR_PERF));
-+ print_speed(amd_pstate_get_data(cpu, AMD_PSTATE_LOWEST_NONLINEAR_FREQ),
-+ no_rounding);
-+ printf(".\n");
-+
-+ printf(_(" AMD PSTATE Lowest Performance: %lu. Lowest Frequency: "),
-+ acpi_cppc_get_data(cpu, LOWEST_PERF));
-+ print_speed(acpi_cppc_get_data(cpu, LOWEST_FREQ) * 1000, no_rounding);
-+ printf(".\n");
-+}
-+
- /* AMD P-States Helper Functions ***************/
- #endif /* defined(__i386__) || defined(__x86_64__) */
-diff --git a/tools/power/cpupower/utils/helpers/helpers.h b/tools/power/cpupower/utils/helpers/helpers.h
-index fbbfa6047c83..5f6862502dbf 100644
---- a/tools/power/cpupower/utils/helpers/helpers.h
-+++ b/tools/power/cpupower/utils/helpers/helpers.h
-@@ -142,6 +142,8 @@ extern int cpufreq_has_boost_support(unsigned int cpu, int *support,
- extern bool cpupower_amd_pstate_enabled(void);
- extern void amd_pstate_boost_init(unsigned int cpu,
- int *support, int *active);
-+extern void amd_pstate_show_perf_and_freq(unsigned int cpu,
-+ int no_rounding);
-
- /* AMD P-States stuff **************************/
-
-@@ -182,6 +184,9 @@ static inline bool cpupower_amd_pstate_enabled(void)
- static void amd_pstate_boost_init(unsigned int cpu,
- int *support, int *active)
- { return; }
-+static inline void amd_pstate_show_perf_and_freq(unsigned int cpu,
-+ int no_rounding)
-+{ return; }
-
- /* cpuid and cpuinfo helpers **************************/
-
-
-
-
-Introduce the amd-pstate driver design and implementation.
-
-Signed-off-by: Huang Rui <ray.huang@amd.com>
----
- Documentation/admin-guide/pm/amd-pstate.rst | 373 ++++++++++++++++++
- .../admin-guide/pm/working-state.rst | 1 +
- 2 files changed, 374 insertions(+)
- create mode 100644 Documentation/admin-guide/pm/amd-pstate.rst
-
-diff --git a/Documentation/admin-guide/pm/amd-pstate.rst b/Documentation/admin-guide/pm/amd-pstate.rst
-new file mode 100644
-index 000000000000..24a88476fc69
---- /dev/null
-+++ b/Documentation/admin-guide/pm/amd-pstate.rst
-@@ -0,0 +1,373 @@
-+.. SPDX-License-Identifier: GPL-2.0
-+.. include:: <isonum.txt>
-+
-+===============================================
-+``amd-pstate`` CPU Performance Scaling Driver
-+===============================================
-+
-+:Copyright: |copy| 2021 Advanced Micro Devices, Inc.
-+
-+:Author: Huang Rui <ray.huang@amd.com>
-+
-+
-+Introduction
-+===================
-+
-+``amd-pstate`` is the AMD CPU performance scaling driver that introduces a
-+new CPU frequency control mechanism on modern AMD APU and CPU series in
-+Linux kernel. The new mechanism is based on Collaborative Processor
-+Performance Control (CPPC) which provides finer grain frequency management
-+than legacy ACPI hardware P-States. Current AMD CPU/APU platforms are using
-+the ACPI P-states driver to manage CPU frequency and clocks with switching
-+only in 3 P-states. CPPC replaces the ACPI P-states controls, allows a
-+flexible, low-latency interface for the Linux kernel to directly
-+communicate the performance hints to hardware.
-+
-+``amd-pstate`` leverages the Linux kernel governors such as ``schedutil``,
-+``ondemand``, etc. to manage the performance hints which are provided by
-+CPPC hardware functionality that internally follows the hardware
-+specification (for details refer to AMD64 Architecture Programmer's Manual
-+Volume 2: System Programming [1]_). Currently ``amd-pstate`` supports basic
-+frequency control function according to kernel governors on some of the
-+Zen2 and Zen3 processors, and we will implement more AMD specific functions
-+in future after we verify them on the hardware and SBIOS.
-+
-+
-+AMD CPPC Overview
-+=======================
-+
-+Collaborative Processor Performance Control (CPPC) interface enumerates a
-+continuous, abstract, and unit-less performance value in a scale that is
-+not tied to a specific performance state / frequency. This is an ACPI
-+standard [2]_ which software can specify application performance goals and
-+hints as a relative target to the infrastructure limits. AMD processors
-+provides the low latency register model (MSR) instead of AML code
-+interpreter for performance adjustments. ``amd-pstate`` will initialize a
-+``struct cpufreq_driver`` instance ``amd_pstate_driver`` with the callbacks
-+to manage each performance update behavior. ::
-+
-+ Highest Perf ------>+-----------------------+ +-----------------------+
-+ | | | |
-+ | | | |
-+ | | Max Perf ---->| |
-+ | | | |
-+ | | | |
-+ Nominal Perf ------>+-----------------------+ +-----------------------+
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | Desired Perf ---->| |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ | | | |
-+ Lowest non- | | | |
-+ linear perf ------>+-----------------------+ +-----------------------+
-+ | | | |
-+ | | Lowest perf ---->| |
-+ | | | |
-+ Lowest perf ------>+-----------------------+ +-----------------------+
-+ | | | |
-+ | | | |
-+ | | | |
-+ 0 ------>+-----------------------+ +-----------------------+
-+
-+ AMD P-States Performance Scale
-+
-+
-+.. _perf_cap:
-+
-+AMD CPPC Performance Capability
-+--------------------------------
-+
-+Highest Performance (RO)
-+.........................
-+
-+It is the absolute maximum performance an individual processor may reach,
-+assuming ideal conditions. This performance level may not be sustainable
-+for long durations and may only be achievable if other platform components
-+are in a specific state; for example, it may require other processors be in
-+an idle state. This would be equivalent to the highest frequencies
-+supported by the processor.
-+
-+Nominal (Guaranteed) Performance (RO)
-+......................................
-+
-+It is the maximum sustained performance level of the processor, assuming
-+ideal operating conditions. In absence of an external constraint (power,
-+thermal, etc.) this is the performance level the processor is expected to
-+be able to maintain continuously. All cores/processors are expected to be
-+able to sustain their nominal performance state simultaneously.
-+
-+Lowest non-linear Performance (RO)
-+...................................
-+
-+It is the lowest performance level at which nonlinear power savings are
-+achieved, for example, due to the combined effects of voltage and frequency
-+scaling. Above this threshold, lower performance levels should be generally
-+more energy efficient than higher performance levels. This register
-+effectively conveys the most efficient performance level to ``amd-pstate``.
-+
-+Lowest Performance (RO)
-+........................
-+
-+It is the absolute lowest performance level of the processor. Selecting a
-+performance level lower than the lowest nonlinear performance level may
-+cause an efficiency penalty but should reduce the instantaneous power
-+consumption of the processor.
-+
-+AMD CPPC Performance Control
-+------------------------------
-+
-+``amd-pstate`` passes performance goals through these registers. The
-+register drives the behavior of the desired performance target.
-+
-+Minimum requested performance (RW)
-+...................................
-+
-+``amd-pstate`` specifies the minimum allowed performance level.
-+
-+Maximum requested performance (RW)
-+...................................
-+
-+``amd-pstate`` specifies a limit the maximum performance that is expected
-+to be supplied by the hardware.
-+
-+Desired performance target (RW)
-+...................................
-+
-+``amd-pstate`` specifies a desired target in the CPPC performance scale as
-+a relative number. This can be expressed as percentage of nominal
-+performance (infrastructure max). Below the nominal sustained performance
-+level, desired performance expresses the average performance level of the
-+processor subject to hardware. Above the nominal performance level,
-+processor must provide at least nominal performance requested and go higher
-+if current operating conditions allow.
-+
-+Energy Performance Preference (EPP) (RW)
-+.........................................
-+
-+Provides a hint to the hardware if software wants to bias toward performance
-+(0x0) or energy efficiency (0xff).
-+
-+
-+Key Governors Support
-+=======================
-+
-+``amd-pstate`` can be used with all the (generic) scaling governors listed
-+by the ``scaling_available_governors`` policy attribute in ``sysfs``. Then,
-+it is responsible for the configuration of policy objects corresponding to
-+CPUs and provides the ``CPUFreq`` core (and the scaling governors attached
-+to the policy objects) with accurate information on the maximum and minimum
-+operating frequencies supported by the hardware. Users can check the
-+``scaling_cur_freq`` information comes from the ``CPUFreq`` core.
-+
-+``amd-pstate`` mainly supports ``schedutil`` and ``ondemand`` for dynamic
-+frequency control. It is to fine tune the processor configuration on
-+``amd-pstate`` to the ``schedutil`` with CPU CFS scheduler. ``amd-pstate``
-+registers adjust_perf callback to implement the CPPC similar performance
-+update behavior. It is initialized by ``sugov_start`` and then populate the
-+CPU's update_util_data pointer to assign ``sugov_update_single_perf`` as
-+the utilization update callback function in CPU scheduler. CPU scheduler
-+will call ``cpufreq_update_util`` and assign the target performance
-+according to the ``struct sugov_cpu`` that utilization update belongs to.
-+Then ``amd-pstate`` updates the desired performance according to the CPU
-+scheduler assigned.
-+
-+
-+Processor Support
-+=======================
-+
-+The ``amd-pstate`` initialization will fail if the _CPC in ACPI SBIOS is
-+not existed at the detected processor, and it uses ``acpi_cpc_valid`` to
-+check the _CPC existence. All Zen based processors support legacy ACPI
-+hardware P-States function, so while the ``amd-pstate`` fails to be
-+initialized, the kernel will fall back to initialize ``acpi-cpufreq``
-+driver.
-+
-+There are two types of hardware implementations for ``amd-pstate``: one is
-+`Full MSR Support <perf_cap_>`_ and another is `Shared Memory Support
-+<perf_cap_>`_. It can use :c:macro:`X86_FEATURE_CPPC` feature flag (for
-+details refer to Processor Programming Reference (PPR) for AMD Family
-+19h Model 21h, Revision B0 Processors [3]_) to indicate the different
-+types. ``amd-pstate`` is to register different ``amd_pstate_perf_funcs``
-+instances for different hardware implementations.
-+
-+Currently, some of Zen2 and Zen3 processors support ``amd-pstate``. In the
-+future, it will be supported on more and more AMD processors.
-+
-+Full MSR Support
-+-----------------
-+
-+Some new Zen3 processors such as Cezanne provide the MSR registers directly
-+while the :c:macro:`X86_FEATURE_CPPC` CPU feature flag is set.
-+``amd-pstate`` can handle the MSR register to implement the fast switch
-+function in ``CPUFreq`` that can shrink latency of frequency control on the
-+interrupt context.
-+
-+Shared Memory Support
-+----------------------
-+
-+If :c:macro:`X86_FEATURE_CPPC` CPU feature flag is not set, that means the
-+processor supports shared memory solution. In this case, ``amd-pstate``
-+uses the ``cppc_acpi`` helper methods to implement the callback functions
-+of ``amd_pstate_perf_funcs``.
-+
-+
-+AMD P-States and ACPI hardware P-States always can be supported in one
-+processor. But AMD P-States has the higher priority and if it is enabled
-+with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond
-+to the request from AMD P-States.
-+
-+
-+User Space Interface in ``sysfs``
-+==================================
-+
-+``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
-+control its functionality at the system level. They located in the
-+``/sys/devices/system/cpu/cpufreq/policyX/`` directory and affect all CPUs. ::
-+
-+ root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_perf
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_perf
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_min_freq
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_nominal_freq
-+ /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_nominal_perf
-+
-+
-+``amd_pstate_highest_perf / amd_pstate_max_freq``
-+
-+Maximum CPPC performance and CPU frequency that the driver is allowed to
-+set in percent of the maximum supported CPPC performance level (the highest
-+performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
-+This attribute is read-only.
-+
-+``amd_pstate_nominal_perf / amd_pstate_nominal_freq``
-+
-+Nominal CPPC performance and CPU frequency that the driver is allowed to
-+set in percent of the maximum supported CPPC performance level (Please see
-+nominal performance in `AMD CPPC Performance Capability <perf_cap_>`_).
-+This attribute is read-only.
-+
-+``amd_pstate_lowest_nonlinear_perf / amd_pstate_lowest_nonlinear_freq``
-+
-+The lowest non-linear CPPC performance and CPU frequency that the driver is
-+allowed to set in percent of the maximum supported CPPC performance level
-+(Please see the lowest non-linear performance in `AMD CPPC Performance
-+Capability <perf_cap_>`_).
-+This attribute is read-only.
-+
-+``amd_pstate_lowest_perf``
-+
-+The lowest physical CPPC performance. The minimum CPU frequency can be read
-+back from ``cpuinfo`` member of ``cpufreq_policy``, so we won't expose it
-+here.
-+This attribute is read-only.
-+
-+
-+``amd-pstate`` vs ``acpi-cpufreq``
-+======================================
-+
-+On majority of AMD platforms supported by ``acpi-cpufreq``, the ACPI tables
-+provided by the platform firmware used for CPU performance scaling, but
-+only provides 3 P-states on AMD processors.
-+However, on modern AMD APU and CPU series, it provides the collaborative
-+processor performance control according to ACPI protocol and customize this
-+for AMD platforms. That is fine-grain and continuous frequency range
-+instead of the legacy hardware P-states. ``amd-pstate`` is the kernel
-+module which supports the new AMD P-States mechanism on most of future AMD
-+platforms. The AMD P-States mechanism will be the more performance and energy
-+efficiency frequency management method on AMD processors.
-+
-+``cpupower`` tool support for ``amd-pstate``
-+===============================================
-+
-+``amd-pstate`` is supported on ``cpupower`` tool that can be used to dump the frequency
-+information. And it is in progress to support more and more operations for new
-+``amd-pstate`` module with this tool. ::
-+
-+ root@hr-test1:/home/ray# cpupower frequency-info
-+ analyzing CPU 0:
-+ driver: amd-pstate
-+ CPUs which run at the same hardware frequency: 0
-+ CPUs which need to have their frequency coordinated by software: 0
-+ maximum transition latency: 131 us
-+ hardware limits: 400 MHz - 4.68 GHz
-+ available cpufreq governors: ondemand conservative powersave userspace performance schedutil
-+ current policy: frequency should be within 400 MHz and 4.68 GHz.
-+ The governor "schedutil" may decide which speed to use
-+ within this range.
-+ current CPU frequency: Unable to call hardware
-+ current CPU frequency: 4.02 GHz (asserted by call to kernel)
-+ boost state support:
-+ Supported: yes
-+ Active: yes
-+ AMD PSTATE Highest Performance: 166. Maximum Frequency: 4.68 GHz.
-+ AMD PSTATE Nominal Performance: 117. Nominal Frequency: 3.30 GHz.
-+ AMD PSTATE Lowest Non-linear Performance: 39. Lowest Non-linear Frequency: 1.10 GHz.
-+ AMD PSTATE Lowest Performance: 15. Lowest Frequency: 400 MHz.
-+
-+
-+Diagnostics and Tuning
-+=======================
-+
-+Trace Events
-+--------------
-+
-+There are two static trace events that can be used for ``amd-pstate``
-+diagnostics. One of them is the cpu_frequency trace event generally used
-+by ``CPUFreq``, and the other one is the ``amd_pstate_perf`` trace event
-+specific to ``amd-pstate``. The following sequence of shell commands can
-+be used to enable them and see their output (if the kernel is generally
-+configured to support event tracing). ::
-+
-+ root@hr-test1:/home/ray# cd /sys/kernel/tracing/
-+ root@hr-test1:/sys/kernel/tracing# echo 1 > events/amd_cpu/enable
-+ root@hr-test1:/sys/kernel/tracing# cat trace
-+ # tracer: nop
-+ #
-+ # entries-in-buffer/entries-written: 47827/42233061 #P:2
-+ #
-+ # _-----=> irqs-off
-+ # / _----=> need-resched
-+ # | / _---=> hardirq/softirq
-+ # || / _--=> preempt-depth
-+ # ||| / delay
-+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
-+ # | | | |||| | |
-+ <idle>-0 [015] dN... 4995.979886: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=15 changed=false fast_switch=true
-+ <idle>-0 [007] d.h.. 4995.979893: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
-+ cat-2161 [000] d.... 4995.980841: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=0 changed=false fast_switch=true
-+ sshd-2125 [004] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=4 changed=false fast_switch=true
-+ <idle>-0 [007] d.s.. 4995.980968: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=7 changed=false fast_switch=true
-+ <idle>-0 [003] d.s.. 4995.980971: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=3 changed=false fast_switch=true
-+ <idle>-0 [011] d.s.. 4995.980996: amd_pstate_perf: amd_min_perf=85 amd_des_perf=85 amd_max_perf=166 cpu_id=11 changed=false fast_switch=true
-+
-+The cpu_frequency trace event will be triggered either by the ``schedutil`` scaling
-+governor (for the policies it is attached to), or by the ``CPUFreq`` core (for the
-+policies with other scaling governors).
-+
-+
-+Reference
-+===========
-+
-+.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
-+ https://www.amd.com/system/files/TechDocs/24593.pdf
-+
-+.. [2] Advanced Configuration and Power Interface Specification,
-+ https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
-+
-+.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 21h, Revision B0 Processors
-+ https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip
-+
-diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst
-index f40994c422dc..5d2757e2de65 100644
---- a/Documentation/admin-guide/pm/working-state.rst
-+++ b/Documentation/admin-guide/pm/working-state.rst
-@@ -11,6 +11,7 @@ Working-State Power Management
- intel_idle
- cpufreq
- intel_pstate
-+ amd-pstate
- cpufreq_drivers
- intel_epb
- intel-speed-select
+--
+2.34.1